Python Digital Forensics Cookbook - Preston Miller - E-Book

Python Digital Forensics Cookbook E-Book

Preston Miller

Description

Over 60 recipes to help you learn digital forensics and leverage Python scripts to amplify your examinations

About This Book

  • Develop code that extracts vital information from everyday forensic acquisitions.
  • Increase the quality and efficiency of your forensic analysis.
  • Leverage the latest resources and capabilities available to the forensic community.

Who This Book Is For

If you are a digital forensics examiner, cyber security specialist, or analyst at heart, understand the basics of Python, and want to take it to the next level, this is the book for you. Along the way, you will be introduced to a number of libraries suitable for parsing forensic artifacts. Readers will be able to use and build upon the scripts we develop to elevate their analysis.

What You Will Learn

  • Understand how Python can enhance digital forensics and investigations
  • Learn to access the contents of, and process, forensic evidence containers
  • Explore malware through automated static analysis
  • Extract and review message contents from a variety of email formats
  • Add depth and context to discovered IP addresses and domains through various Application Program Interfaces (APIs)
  • Delve into mobile forensics and recover deleted messages from SQLite databases
  • Index large logs into a platform to better query and visualize datasets

In Detail

Technology plays an increasingly large role in our daily lives and shows no sign of stopping. Now, more than ever, it is paramount that an investigator develops programming expertise to deal with increasingly large datasets.

By leveraging the Python recipes explored throughout this book, we make the complex simple, quickly extracting relevant information from large datasets. You will explore, develop, and deploy Python code and libraries to provide meaningful results that can be immediately applied to your investigations. Throughout the Python Digital Forensics Cookbook, recipes include topics such as working with forensic evidence containers, parsing mobile and desktop operating system artifacts, extracting embedded metadata from documents and executables, and identifying indicators of compromise. You will also learn to integrate scripts with Application Program Interfaces (APIs) such as VirusTotal and PassiveTotal, and tools such as Axiom, Cellebrite, and EnCase.

By the end of the book, you will have a sound understanding of Python and how you can use it to process artifacts in your investigations.

Style and approach

Our succinct recipes take a no-frills approach to solving common challenges faced in investigations. The code in this book covers a wide range of artifacts and data sources. These examples will help improve the accuracy and efficiency of your analysis—no matter the situation.


Page count: 436

Year of publication: 2017




Python Digital Forensics Cookbook

Effective Python recipes for digital investigations

Preston Miller
Chapin Bryce

BIRMINGHAM - MUMBAI

Python Digital Forensics Cookbook

 

Copyright © 2017 Packt Publishing

 

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

 

First published: September 2017

 

Production reference: 1220917

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.

ISBN 978-1-78398-746-7

 

www.packtpub.com

Credits

Authors: Preston Miller, Chapin Bryce

Copy Editor: Stuti Srivastava

Reviewer: Dr. Michael Spreitzenbarth

Project Coordinator: Virginia Dias

Commissioning Editor: Kartikey Pandey

Proofreader: Safis Editing

Acquisition Editor: Rahul Nair

Indexer: Aishwarya Gangawane

Content Development Editor: Sharon Raj

Graphics: Kirk D'Penha

Technical Editor: Prashant Chaudhari

Production Coordinator: Aparna Bhagat

About the Authors

Preston Miller is a consultant at an internationally recognized risk management firm. He holds an undergraduate degree from Vassar College and a master’s degree in Digital Forensics from Marshall University. While at Marshall, Preston unanimously received the prestigious J. Edgar Hoover Foundation’s Scientific Scholarship. He is a published author, recently of Learning Python for Forensics, an introductory Python Forensics textbook. Preston is also a member of the GIAC advisory board and holds multiple industry-recognized certifications in his field.

 

 

 

 

Chapin Bryce works as a consultant in digital forensics, focusing on litigation support, incident response, and intellectual property investigations. After studying computer and digital forensics at Champlain College, he joined a firm leading the field of digital forensics and investigations. In his downtime, Chapin enjoys working on side projects, hiking, and skiing (if the weather permits). As a member of multiple ongoing research and development projects, he has authored several articles in professional and academic publications.

 

About the Reviewer

Dr. Michael Spreitzenbarth, after finishing his diploma thesis with the major topic of mobile phone forensics, worked as a freelancer in the IT security sector for several years. In 2013, he finished his PhD at the University of Erlangen-Nuremberg in the field of Android forensics and mobile malware analysis. Since then, he has been working as a team lead in an internationally operating CERT.

Dr. Michael Spreitzenbarth's daily work deals with the security of mobile systems, forensic analysis of smartphones and suspicious mobile applications, as well as the investigation of security-related incidents within ICS environments. At the same time he is working on the improvement of mobile malware analysis techniques and research in the field of Android and iOS forensics as well as mobile application testing.

www.PacktPub.com

For support files and downloads related to your book, please visit www.PacktPub.com.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

 

https://www.packtpub.com/mapt

Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.

Why subscribe?

Fully searchable across every book published by Packt

Copy and paste, print, and bookmark content

On demand and accessible via a web browser

Customer Feedback

Thanks for purchasing this Packt book. At Packt, quality is at the heart of our editorial process. To help us improve, please leave us an honest review.

If you'd like to join our team of regular reviewers, you can email us at [email protected]. We award our regular reviewers with free eBooks and videos in exchange for their valuable feedback. Help us be relentless in improving our products!

To my mother, Mary, whose love, courage, and guidance have had an indelible impact on me. I love you very much.
Preston Miller

This book is dedicated to the love of my life and my best friend, Alexa. Thank you for all of the love, support, and laughter.
Chapin Bryce

Table of Contents

Preface

What this book covers

What you need for this book

Who this book is for

Sections

Getting ready

How to do it…

How it works…

There's more…

See also

Conventions

Reader feedback

Customer support

Downloading the example code

Downloading the color images of this book

Errata

Piracy

Questions

Essential Scripting and File Information Recipes

Introduction

Handling arguments like an adult

Getting started

How to do it…

How it works…

There's more…

Iterating over loose files

Getting started

How to do it…

How it works…

There's more…

Recording file attributes

Getting started

How to do it…

How it works…

There's more…

Copying files, attributes, and timestamps

Getting started

How to do it…

How it works…

There's more…

Hashing files and data streams

Getting started

How to do it…

How it works…

There's more…

Keeping track with a progress bar

Getting started

How to do it…

How it works…

There's more…

Logging results

Getting started

How to do it…

How it works…

There’s more…

Multiple hands make light work

Getting started

How to do it…

How it works…

There's more…

Creating Artifact Report Recipes

Introduction

Using HTML templates

Getting started

How to do it...

How it works...

There's more...

Creating a paper trail

Getting started

How to do it...

How it works...

There's more...

Working with CSVs

Getting started

How to do it...

How it works...

There's more...

Visualizing events with Excel

Getting started

How to do it...

How it works...

Auditing your work

Getting started

How to do it...

How it works...

There's more...

A Deep Dive into Mobile Forensic Recipes

Introduction

Parsing PLIST files

Getting started

How to do it...

How it works...

There's more…

Handling SQLite databases

Getting started

How to do it...

How it works...

Identifying gaps in SQLite databases

Getting started

How to do it...

How it works...

See also

Processing iTunes backups

Getting started

How to do it...

How it works...

There's more...

Putting Wi-Fi on the map

Getting started

How to do it...

How it works...

Digging deep to recover messages

Getting started

How to do it...

How it works...

There's more…

Extracting Embedded Metadata Recipes

Introduction

Extracting audio and video metadata

Getting started

How to do it...

How it works...

There's more...

The big picture

Getting started

How to do it...

How it works...

There's more...

Mining for PDF metadata

Getting started

How to do it...

How it works...

There's more...

Reviewing executable metadata

Getting started

How to do it...

How it works...

There's more...

Reading office document metadata

Getting started

How to do it...

How it works...

Integrating our metadata extractor with EnCase

Getting started

How to do it...

How it works...

There's more...

Networking and Indicators of Compromise Recipes

Introduction

Getting a jump start with IEF

Getting started

How to do it...

How it works...

Coming into contact with IEF

Getting started

How to do it...

How it works...

Beautiful Soup

Getting started

How to do it...

How it works...

There's more...

Going hunting for viruses

Getting started

How to do it...

How it works...

Gathering intel

Getting started

How to do it...

How it works...

Totally passive

Getting started

How to do it...

How it works...

Reading Emails and Taking Names Recipes

Introduction

Parsing EML files

Getting started

How to do it...

How it works...

Viewing MSG files

Getting started

How to do it...

How it works...

There’s more...

See also

Ordering Takeout

Getting started

How to do it...

How it works...

There’s more...

What’s in the box?!

Getting started

How to do it...

How it works...

Parsing PST and OST mailboxes

Getting started

How to do it...

How it works...

There’s more...

See also

Log-Based Artifact Recipes

Introduction

About time

Getting started

How to do it...

How it works...

There's more...

Parsing IIS web logs with RegEx

Getting started

How to do it...

How it works...

There's more...

Going spelunking

Getting started

How to do it...

How it works...

There's more...

Interpreting the daily.out log

Getting started

How to do it...

How it works...

Adding daily.out parsing to Axiom

Getting started

How to do it...

How it works...

Scanning for indicators with YARA

Getting started

How to do it...

How it works...

Working with Forensic Evidence Container Recipes

Introduction

Opening acquisitions

Getting started

How to do it...

How it works...

Gathering acquisition and media information

Getting started

How to do it...

How it works...

Iterating through files

Getting started

How to do it...

How it works...

There's more...

Processing files within the container

Getting started

How to do it...

How it works...

Searching for hashes

Getting started

How to do it...

How it works...

There's more...

Exploring Windows Forensic Artifacts Recipes - Part I

Introduction

One man's trash is a forensic examiner's treasure

Getting started

How to do it...

How it works...

A sticky situation

Getting started

How to do it...

How it works...

Reading the registry

Getting started

How to do it...

How it works...

There's more...

Gathering user activity

Getting started

How to do it...

How it works...

There's more...

The missing link

Getting started

How to do it...

How it works...

There's more...

Searching high and low

Getting started

How to do it...

How it works...

There's more...

Exploring Windows Forensic Artifacts Recipes - Part II

Introduction

Parsing prefetch files

Getting started

How to do it...

How it works...

There's more...

A series of fortunate events

Getting started

How to do it...

How it works...

There's more...

Indexing internet history

Getting started

How to do it...

How it works...

There's more...

Shadow of a former self

Getting started

How to do it...

How it works...

There's more...

Dissecting the SRUM database

Getting started

How to do it...

How it works...

There's more...

Conclusion

Preface

At the outset of this book, we strove to demonstrate a nearly endless corpus of use cases for Python in today’s digital investigations. Technology plays an increasingly large role in our daily life and shows no signs of stopping. Now, more than ever, it is paramount that an investigator develop programming expertise to work with increasingly large datasets. By leveraging the Python recipes explored throughout this book, we make the complex simple, efficiently extracting relevant information from large data sets. You will explore, develop, and deploy Python code and libraries to provide meaningful results that can be immediately applied to your investigations.

Throughout the book, recipes include topics such as working with forensic evidence containers, parsing mobile and desktop operating system artifacts, extracting embedded metadata from documents and executables, and identifying indicators of compromise. You will also learn how to integrate scripts with Application Program Interfaces (APIs) such as VirusTotal and PassiveTotal, and tools, such as Axiom, Cellebrite, and EnCase. By the end of the book, you will have a sound understanding of Python and will know how you can use it to process artifacts in your investigations.

What this book covers

Chapter 1, Essential Scripting and File Information Recipes, introduces you to the conventions and basic features of Python used throughout the book. By the end of the chapter, you will create a robust and useful data and metadata preservation script.

Chapter 2, Creating Artifact Report Recipes, demonstrates practical methods of creating reports with forensic artifacts. From spreadsheets to web-based dashboards, we show the flexibility and utility of various reporting formats.

Chapter 3, A Deep Dive into Mobile Forensic Recipes, features iTunes backup processing, deleted SQLite database record recovery, and mapping Wi-Fi access point MAC addresses from Cellebrite XML reports.

Chapter 4, Extracting Embedded Metadata Recipes, exposes common file types containing embedded metadata and how to extract it. We also provide you with knowledge of how to integrate Python scripts with the popular forensic software, EnCase.

Chapter 5, Networking and Indicators of Compromise Recipes, focuses on network and web-based artifacts and how to extract more information from them. You will learn how to preserve data from websites, interact with processed IEF results, create hash sets for X-Ways, and identify bad domains or IP addresses.

Chapter 6, Reading Emails and Taking Names Recipes, explores the many file types for both individual e-mail messages and entire mailboxes, including Google Takeout MBox, and how to use Python for extraction and analysis.

Chapter 7, Log-Based Artifact Recipes, illustrates how to process artifacts from several log formats, such as IIS, and ingest them with Python into reports or other industry tools, such as Splunk. You will also learn how to develop and use Python recipes to parse files and create artifacts within Axiom.

Chapter 8, Working with Forensic Evidence Container Recipes, shows off the basic forensic libraries required to interact with and process forensic evidence containers, including EWF and raw formats. You will learn how to access data from forensic containers, identify disk partition information, and iterate through filesystems.

Chapter 9, Exploring Windows Forensic Artifacts Recipes Part I, leverages the framework developed in Chapter 8, Working with Forensic Evidence Container Recipes, to process various Windows artifacts within forensic evidence containers. These artifacts include $I Recycle Bin files, various Registry artifacts, LNK files, and the Windows.edb index.

Chapter 10, Exploring Windows Forensic Artifacts Recipes Part II, continues to leverage the framework developed in Chapter 8, Working with Forensic Evidence Container Recipes, to process more Windows artifacts within forensic evidence containers. These artifacts include Prefetch files, Event logs, Index.dat, Volume Shadow Copies, and the Windows 10 SRUM database.

What you need for this book

In order to follow along with and execute the recipes within this cookbook, use a computer with an Internet connection and the latest Python 2.7 and Python 3.5 installations. Recipes may require additional third-party libraries to be installed; instructions for doing that are provided in the recipe.

For ease of development and implementation of these recipes, it is recommended that you set up and configure an Ubuntu virtual machine for development. These recipes, unless otherwise noted, were built and tested within an Ubuntu 16.04 environment with both Python 2.7 and 3.5. Several recipes will require the use of a Windows operating system, as many forensic tools operate only on this platform.

Who this book is for

If you are a digital forensics examiner, cyber security specialist, or analyst at heart who understands the basics of Python and wants to take it to the next level, this is the book for you. Along the way, you will be introduced to a number of libraries suited for parsing forensic artifacts. You will be able to use and build upon the scripts we develop in order to elevate your analysis.

Sections

In this book, you will find several headings that appear frequently (Getting ready, How to do it…, How it works…, There's more…, and See also).

To give clear instructions on how to complete a recipe, we use these sections as follows:

Getting ready

This section tells you what to expect in the recipe, and describes how to set up any software or any preliminary settings required for the recipe.

How to do it…

This section contains the steps required to follow the recipe.

How it works…

This section usually consists of a detailed explanation of what happened in the previous section.

There's more…

This section consists of additional information about the recipe in order to make the reader more knowledgeable about the recipe.

See also

This section provides helpful links to other useful information for the recipe.

Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "We can gather the required information by calling the get_data() function."

A block of code is set as follows:

def hello_world():
    print("Hello World!")

hello_world()

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

def hello_world():
    print("Hello World!")

hello_world()

Any command-line input or output is written as follows:

# pip install tqdm==4.11.2

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "Select System info from the Administration panel."

Warnings or important notes appear like this.
Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book: what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply email [email protected], and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you. You can download the code files by following these steps:

Log in or register to our website using your e-mail address and password.

Hover the mouse pointer on the SUPPORT tab at the top.

Click on Code Downloads & Errata.

Enter the name of the book in the Search box.

Select the book for which you're looking to download the code files.

Choose from the drop-down menu where you purchased this book from.

Click on Code Download.

You can also download the code files by clicking on the Code Files button on the book's webpage at the Packt Publishing website. This page can be accessed by entering the book's name in the Search box. Please note that you need to be logged in to your Packt account. Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR / 7-Zip for Windows

Zipeg / iZip / UnRarX for Mac

7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Python-Digital-Forensics-Cookbook. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Downloading the color images of this book

We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/PythonDigitalForensicsCookbook_ColorImages.pdf.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books, whether in the text or the code, we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously.

If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at [email protected] with a link to the suspected pirated material. We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at [email protected], and we will do our best to address the problem.

Essential Scripting and File Information Recipes

The following recipes are covered in this chapter:

Handling arguments like an adult

Iterating over loose files

Recording file attributes

Copying files, attributes, and timestamps

Hashing files and data streams

Keeping track with a progress bar

Logging results

Multiple hands make light work

Introduction

Digital forensics involves the identification and analysis of digital media to assist in legal, business, and other types of investigations. Oftentimes, results stemming from our analysis have a major impact on the direction of an investigation. With Moore’s law more or less holding true, the amount of data we are expected to review is steadily growing. Given this, it’s a foregone conclusion that an investigator must rely on some level of automation to effectively review evidence. Automation, much like a theory, must be thoroughly vetted and validated so as not to allow for falsely drawn conclusions. Unfortunately, investigators may use a tool to automate some process but not fully understand the tool, the underlying forensic artifact, or the output’s significance. This is where Python comes into play.

In Python Digital Forensics Cookbook, we develop and detail recipes covering a number of typical scenarios. The purpose is to not only demonstrate Python features and libraries for those learning the language but to also illustrate one of its great benefits: namely, a forced basic understanding of the artifact. Without this understanding, it is impossible to develop the code in the first place, thereby forcing you to understand the artifact at a deeper level. Add to that the relative ease of Python and the obvious benefits of automation, and it is easy to see why this language has been adopted so readily by the community.

One method of ensuring that investigators understand the product of our scripts is to provide meaningful documentation and explanation of the code. Hence the purpose of this book. The recipes demonstrated throughout show how to configure argument parsing that is both easy to develop and simple for the user to understand. To add to the script's documentation, we will cover techniques to effectively log the process that was taken and any errors encountered by the script.

Another unique feature of scripts designed for digital forensics is the interaction with files and their associated metadata. Forensic scripts and applications require the accurate retrieval and preservation of file attributes, including dates, permissions, and file hashes. This chapter will cover methods to extract and present this data to the examiner.
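As a rough illustration of what retrieving file attributes and hashes can look like, here is a minimal sketch using only the standard library. It is not the recipe code from this chapter: the function names hash_file(), to_utc(), and file_metadata() are hypothetical, the sample file is a throwaway stand-in for evidence, and the sketch assumes Python 3.

```python
import hashlib
import os
import tempfile
from datetime import datetime, timezone


def hash_file(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file in chunks so large files never sit fully in memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as source:
        for chunk in iter(lambda: source.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def to_utc(timestamp):
    """Convert a POSIX timestamp to an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()


def file_metadata(path):
    """Collect size, permission bits, and timestamps reported by os.stat()."""
    info = os.stat(path)
    return {
        "size": info.st_size,
        "mode": oct(info.st_mode),
        "modified": to_utc(info.st_mtime),
        "accessed": to_utc(info.st_atime),
        # st_ctime is the metadata-change time on Unix; Windows has
        # traditionally reported the creation time in this field.
        "changed_or_created": to_utc(info.st_ctime),
    }


# Demonstrate on a throwaway file (a stand-in for real evidence).
with tempfile.NamedTemporaryFile(delete=False) as sample:
    sample.write(b"evidence")
    sample_path = sample.name

metadata = file_metadata(sample_path)
digest = hash_file(sample_path)
os.remove(sample_path)

print(metadata)
print(digest)
```

Reading in fixed-size chunks keeps memory use flat regardless of file size, and reporting timestamps in UTC avoids ambiguity when results are compared across systems in different time zones.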

Interaction with the operating system and the files found on attached volumes is at the core of any script designed for use in digital forensics. During analysis, we need to access and parse files with a wide variety of structures and formats. For this reason, it's important to accurately and properly handle and interact with files. The recipes presented in this chapter cover common libraries and techniques that will continue to be used throughout the book:

Parsing command-line arguments

Recursively iterating over files and folders

Recording and preserving file and folder metadata

Generating hash values of files and other content

Monitoring code with progress bars

Logging recipe execution information and errors

Improving performance with multiprocessing

Visit www.packtpub.com/books/content/support to download the code bundle for this chapter.

Handling arguments like an adult

Recipe Difficulty: Easy

Python Version: 2.7 or 3.5

Operating System: Any

Person A: I came here for a good argument!
Person B: Ah, no you didn't, you came here for an argument!
Person A: An argument isn't just contradiction.
Person B: Well! it can be!
Person A: No it can't! An argument is a connected series of statements intended to establish a proposition.
Person B: No it isn't!
Person A: Yes it is! It isn't just contradiction.

Monty Python (http://www.montypython.net/scripts/argument.php) aside, arguments are an integral part of any script. Arguments allow us to provide an interface for users to specify options and configurations that change the way the code behaves. Effective use of arguments, not just contradictions, can make a tool more versatile and a favorite among examiners.

Getting started

All libraries used in this script are present in Python's standard library. While there are other argument-handling libraries available, such as optparse and ConfigParser, our scripts will leverage argparse as our de facto command-line handler. optparse was the library to use in earlier versions of Python; argparse has since replaced it for creating argument-handling code. The ConfigParser library (named configparser in Python 3) parses arguments from a configuration file instead of the command line. This is useful for code that requires a large number of arguments or has a significant number of options. We will not cover ConfigParser in this book, though it is worth exploring if you find your argparse configuration becomes difficult to maintain.

To learn more about the argparse library, visit https://docs.python.org/3/library/argparse.html.

How to do it…

In this script, we perform the following steps:

Create positional and optional arguments.

Add descriptions to arguments.

Configure arguments with select choices.
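The steps above can be sketched as follows. This is a minimal illustration under our own assumptions rather than the recipe's actual script: build_parser() and every argument name (input_file, output_file, --log, --verbose, --hash-algorithm) are hypothetical. We pass an explicit list to parse_args() so the example runs without real command-line input.

```python
import argparse


def build_parser():
    """Return a parser with positional, optional, and choice-limited arguments."""
    parser = argparse.ArgumentParser(
        description="Illustrative argument handling for a forensic script",
        epilog="Hypothetical example; the argument names are placeholders",
    )
    # Positional arguments are required and matched by their order.
    parser.add_argument("input_file", help="Path to the file to process")
    parser.add_argument("output_file", help="Path for the generated report")
    # Optional arguments start with dashes and may carry a default value.
    parser.add_argument("-l", "--log", default="script.log",
                        help="Path to the log file")
    # store_true flags take no value and default to False.
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Print additional detail")
    # choices restricts the values argparse will accept.
    parser.add_argument("--hash-algorithm", default="sha256",
                        choices=["md5", "sha1", "sha256"],
                        help="Hash algorithm to apply")
    return parser


# The list passed here stands in for sys.argv[1:].
args = build_parser().parse_args(
    ["evidence.bin", "report.csv", "--hash-algorithm", "md5", "-v"]
)
print(args.input_file, args.output_file, args.log,
      args.hash_algorithm, args.verbose)
```

Note that argparse converts the dash in --hash-algorithm to an underscore, so the value is read back as args.hash_algorithm. A value outside the listed choices causes argparse to print a usage message and exit.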

There's more…

This script can be further improved. We have provided a couple of recommendations here:

Explore additional argparse functionality. For example, the argparse.FileType object can be used to accept a File object as an input.

We can also use the argparse.ArgumentDefaultsHelpFormatter class to show defaults we set to the user. This is helpful when combined with optional arguments to show the user what will be used if nothing is specified.
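Both recommendations can be tried in a few lines. The sketch below assumes Python 3; the argument names and the temporary file are made up for the demonstration.

```python
import argparse
import os
import tempfile


def build_parser():
    parser = argparse.ArgumentParser(
        description="FileType and default-display sketch",
        # Appends "(default: ...)" to the help text of optional arguments
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    # FileType opens the named path and hands back a file object
    parser.add_argument("INPUT_FILE", type=argparse.FileType("r"),
                        help="File to read")
    parser.add_argument("-n", "--lines", type=int, default=10,
                        help="Number of lines to show")
    return parser


parser = build_parser()

# Create a throwaway file so the example does not depend on existing paths
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first line\nsecond line\n")

args = parser.parse_args([tmp.name])
first_line = args.INPUT_FILE.readline().strip()
args.INPUT_FILE.close()
os.remove(tmp.name)

help_text = parser.format_help()
print(first_line)
print(help_text)
```

Because FileType opens the file during argument parsing, the script is responsible for closing it afterwards.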

Iterating over loose files

Recipe Difficulty: Easy

Python Version: 2.7 or 3.5

Operating System: Any

Often it is necessary to iterate over a directory and its subdirectories to recursively process all files. In this recipe, we will illustrate how to use Python to walk through directories and access files within them. Understanding how you can recursively navigate a given input directory is key as we frequently perform this exercise in our scripts.

Getting started

All libraries used in this script are present in Python's standard library. The preferred library, in most situations, for handling file and folder iteration is the built-in os library. While this library supports many useful operations, we will focus on the os.path module and the os.walk() function. Let’s use the following folder hierarchy as an example to demonstrate how directory iteration works in Python:

SecretDocs/
|-- key.txt
|-- Plans
|   |-- plans_0012b.txt
|   |-- plans_0016.txt
|   `-- Successful_Plans
|       |-- plan_0001.txt
|       |-- plan_0427.txt
|       `-- plan_0630.txt
|-- Spreadsheets
|   |-- costs.csv
|   `-- profit.csv
`-- Team
    |-- Contact18.vcf
    |-- Contact1.vcf
    `-- Contact6.vcf

4 directories, 11 files

How to do it…

The following steps are performed in this recipe:

Create a positional argument for the input directory to scan.

Iterate over all subdirectories and print file paths to the console.
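The directory walk described in these steps might look like the sketch below. To keep the example self-contained, it rebuilds a small part of the SecretDocs hierarchy in a temporary folder instead of reading a path from the command line. The function name list_files is a placeholder.

```python
import os
import tempfile


def list_files(root_dir):
    """Return the path of every file below root_dir, recursively."""
    found = []
    # os.walk yields (current directory, subdirectory names, file names)
    for current_dir, _subdirs, file_names in os.walk(root_dir):
        for name in file_names:
            found.append(os.path.join(current_dir, name))
    return sorted(found)


# Rebuild part of the example hierarchy in a temporary folder
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "Plans", "Successful_Plans"))
for rel in ("key.txt",
            os.path.join("Plans", "plans_0016.txt"),
            os.path.join("Plans", "Successful_Plans", "plan_0001.txt")):
    open(os.path.join(demo_root, rel), "w").close()

for path in list_files(demo_root):
    print(path)
```

In a full script, root_dir would come from the positional argument created in the first step.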

There's more…

This script can be further improved. Here's a recommendation:

Check out and implement similar functionality using the glob library which, unlike the os module, allows for wildcard pattern recursive searches for files and directories.
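A minimal sketch of the glob approach follows. It assumes Python 3.5 or later, because the recursive "**" pattern is not available in Python 2.7. The helper name find_by_pattern and the demo folder are illustrative only.

```python
import glob
import os
import tempfile


def find_by_pattern(root_dir, pattern):
    """Return every path below root_dir whose name matches pattern."""
    # "**" matches zero or more directory levels when recursive=True
    search = os.path.join(root_dir, "**", pattern)
    return sorted(glob.glob(search, recursive=True))


# Small demo hierarchy with two text files and one CSV file
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "Plans"))
os.makedirs(os.path.join(demo_root, "Spreadsheets"))
for rel in ("key.txt",
            os.path.join("Plans", "plans_0016.txt"),
            os.path.join("Spreadsheets", "costs.csv")):
    open(os.path.join(demo_root, rel), "w").close()

text_files = find_by_pattern(demo_root, "*.txt")
for path in text_files:
    print(path)
```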

Recording file attributes

Recipe Difficulty: Easy

Python Version: 2.7 or 3.5

Operating System: Any

Now that we can iterate over files and folders, let’s learn to record metadata about these objects. File metadata plays an important role in forensics, as collecting and reviewing this information is a basic task during most investigations. Using a single Python library, we can gather some of the most important attributes of files across platforms.

Getting started

All libraries used in this script are present in Python’s standard library. The os library, once again, can be used here to gather file metadata. One of the most helpful methods for gathering file metadata is the os.stat() function. It's important to note that the stat() call only provides information available with the current operating system and the filesystem of the mounted volume. Most forensic suites allow an examiner to mount a forensic image as a volume on a system and generally preserve the file attributes available to the stat call. In Chapter 8, Working with Forensic Evidence Containers Recipes, we will demonstrate how to open forensic acquisitions to directly extract file information.

To learn more about the os library, visit https://docs.python.org/3/library/os.html.

How to do it…

We will record file attributes using the following steps:

Obtain the input file to process.

Print various metadata: MAC times, file size, group and owner ID, and so on.
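As a rough illustration of these steps, the sketch below reads a handful of os.stat() fields and converts the timestamps to UTC. It assumes Python 3; the dictionary keys and the function name file_metadata are placeholders, and a temporary file stands in for the input file.

```python
import os
import tempfile
from datetime import datetime, timezone


def file_metadata(path):
    """Collect commonly reviewed attributes of a single file."""
    stat_info = os.stat(path)

    def to_utc(timestamp):
        return datetime.fromtimestamp(timestamp, tz=timezone.utc)

    return {
        "size_bytes": stat_info.st_size,
        "modified_utc": to_utc(stat_info.st_mtime),
        "accessed_utc": to_utc(stat_info.st_atime),
        # st_ctime is the metadata change time on Unix-like systems
        # and the creation time on Windows
        "ctime_utc": to_utc(stat_info.st_ctime),
        "mode": oct(stat_info.st_mode),
        "inode": stat_info.st_ino,
        "owner_id": stat_info.st_uid,
        "group_id": stat_info.st_gid,
    }


# A throwaway file stands in for the file supplied by the user
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"12345")

meta = file_metadata(tmp.name)
for key, value in meta.items():
    print("{}: {}".format(key, value))
```

On Windows, the owner and group IDs are reported as 0, which reflects the point made earlier that stat() only returns what the operating system and filesystem expose.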

There's more…

This script can be further improved. We have provided a couple of recommendations here:

Integrate this recipe with the Iterating over loose files recipe to recursively extract metadata for files in a given series of directories.

Implement logic to filter by file extension, date modified, or even file size to only collect metadata information on files matching the desired criteria.
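One possible way to combine both recommendations is sketched below: walk a directory tree and keep metadata only for files that pass an extension and size filter. The function name collect_metadata, its parameters, and the demo files are assumptions made for this example.

```python
import os
import tempfile


def collect_metadata(root_dir, extensions=None, min_size=0):
    """Return (path, size, modified time) for files matching the filters."""
    results = []
    for current_dir, _subdirs, file_names in os.walk(root_dir):
        for name in file_names:
            path = os.path.join(current_dir, name)
            # Compare extensions case-insensitively, e.g. ".TXT" and ".txt"
            extension = os.path.splitext(name)[1].lower()
            if extensions and extension not in extensions:
                continue
            stat_info = os.stat(path)
            if stat_info.st_size < min_size:
                continue
            results.append((path, stat_info.st_size, stat_info.st_mtime))
    return results


# Demo tree: one large text file, one small text file, one CSV file
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "sub"))
samples = {"a.txt": b"0123456789",
           "b.csv": b"1,2",
           os.path.join("sub", "c.txt"): b"x"}
for rel, content in samples.items():
    with open(os.path.join(demo_root, rel), "wb") as handle:
        handle.write(content)

matches = collect_metadata(demo_root, extensions={".txt"}, min_size=5)
print(matches)
```

A date filter could be added the same way by comparing st_mtime against a cutoff timestamp.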

Copying files, attributes, and timestamps

Recipe Difficulty: Easy

Python Version: 2.7 or 3.5

Operating System: Windows

Preserving files is a fundamental task in digital forensics. It is often preferable to containerize files in a format that can store hashes and other metadata of loose files. However, sometimes we need to copy files in a forensic manner from one location to another. Using this recipe, we will demonstrate some of the methods available to copy files while preserving common metadata fields.

Getting started

This recipe requires the installation of two third-party modules, pywin32 and pytz. All other libraries used in this script are present in Python's standard library. This recipe will primarily use two libraries, the built-in shutil and a third-party library, pywin32. The shutil library is our go-to for copying files within Python, and we can use it to preserve most of the timestamps and other file attributes. The shutil module, however, is unable to preserve the creation time of files it copies. Rather, we must rely on the Windows-specific pywin32 library to preserve it. While the pywin32 library is platform specific, it is incredibly useful to interact with the Windows operating system.

To learn more about the shutil library, visit https://docs.python.org/3/library/shutil.html.

To install pywin32, we need to access its SourceForge page at https://sourceforge.net/projects/pywin32/ and download the version that matches our Python installation. To check our Python version, we can import the sys module and inspect the sys.version attribute within an interpreter. Both the version and the architecture are important when selecting the correct pywin32 installer.

To learn more about the sys library, visit https://docs.python.org/3/library/sys.html.
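A quick way to check both values from an interpreter is shown below. Using struct.calcsize to derive the architecture is one common approach, not necessarily the one the authors had in mind.

```python
import struct
import sys

# Full version string, including build details
print(sys.version)

# The size of a pointer in bits separates 32-bit from 64-bit interpreters
bits = struct.calcsize("P") * 8
print("{}-bit Python {}.{}".format(
    bits, sys.version_info[0], sys.version_info[1]))
```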

In addition to the installation of the pywin32 library, we need to install pytz, a third-party library used to manage time zones in Python. We can install this library using the pip command:

pip install pytz==2017.2

How to do it…

We perform the following steps to forensically copy files on a Windows system:

Gather source file and destination arguments.

Use shutil to copy and preserve most file metadata.

Manually set timestamp attributes with win32file.
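The steps above could be approximated as in the following sketch. It assumes Python 3 and uses the standard library's timezone.utc in place of pytz. The win32file branch only runs where pywin32 is importable, so on other platforms the example falls back to shutil alone. The function name forensic_copy and the demo paths are hypothetical.

```python
import os
import shutil
import tempfile
from datetime import datetime, timezone

try:
    import win32file  # part of pywin32; only importable on Windows
except ImportError:
    win32file = None


def _to_utc(timestamp):
    """Convert a POSIX timestamp to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


def forensic_copy(source, dest_dir):
    """Copy source into dest_dir and return the path of the copy."""
    dest_file = os.path.join(dest_dir, os.path.basename(source))
    # copy2 copies the content plus permission bits, access and modified times
    shutil.copy2(source, dest_file)

    if win32file is not None:
        # copy2 does not carry over the creation time, so set it explicitly.
        # On Windows, st_ctime holds the creation time.
        src_stat = os.stat(source)
        handle = win32file.CreateFile(
            dest_file, win32file.GENERIC_WRITE, win32file.FILE_SHARE_WRITE,
            None, win32file.OPEN_EXISTING, win32file.FILE_ATTRIBUTE_NORMAL,
            None)
        try:
            win32file.SetFileTime(handle,
                                  _to_utc(src_stat.st_ctime),
                                  _to_utc(src_stat.st_atime),
                                  _to_utc(src_stat.st_mtime))
        finally:
            win32file.CloseHandle(handle)
    return dest_file


# Demo: create a source file and backdate it so a preserved
# timestamp is distinguishable from the time of the copy
work_dir = tempfile.mkdtemp()
dest_dir = tempfile.mkdtemp()
source = os.path.join(work_dir, "note.txt")
with open(source, "w") as out_file:
    out_file.write("evidence")
os.utime(source, (1500000000, 1500000000))

copied = forensic_copy(source, dest_dir)
print(copied, os.stat(copied).st_mtime)
```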

There's more…

This script can be further improved. We have provided a couple of recommendations here:

Hash the source and destination files to ensure they were copied successfully. File hashing is introduced in the Hashing files and data streams recipe, which follows this one.

Output a log of the files copied and any exceptions encountered during the copying process.
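Both recommendations can be prototyped together, as in the sketch below: copy a file, compare SHA-256 digests of the source and the copy, and log the outcome or any exception. The logger name, function names, and chunk size are arbitrary choices for the example.

```python
import hashlib
import logging
import os
import shutil
import tempfile

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("copy_verify")  # hypothetical logger name


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not exhaust memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, dest):
    """Copy source to dest; return True only if the hashes match."""
    try:
        shutil.copy2(source, dest)
    except (IOError, OSError) as error:
        logger.error("Failed to copy %s: %s", source, error)
        return False
    matched = sha256_of(source) == sha256_of(dest)
    if matched:
        logger.info("Copied and verified %s -> %s", source, dest)
    else:
        logger.warning("Hash mismatch after copying %s", source)
    return matched


# Demo with a temporary file, plus a deliberately missing source
demo_dir = tempfile.mkdtemp()
original = os.path.join(demo_dir, "original.bin")
with open(original, "wb") as handle:
    handle.write(b"forensic data")

ok = copy_and_verify(original, os.path.join(demo_dir, "copy.bin"))
failed = copy_and_verify(os.path.join(demo_dir, "missing.bin"),
                         os.path.join(demo_dir, "never.bin"))
print(ok, failed)
```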

Hashing files and data streams

Recipe Difficulty: Easy

Python Version: 2.7 or 3.5

Operating System: Any

File hashes are a widely accepted identifier for determining file integrity and authenticity. While some algorithms have become vulnerable to collision attacks, the process is still important in the field. In this recipe, we will cover the process of hashing a string of characters and a stream of file content.

Getting started

All libraries used in this script are present in Python’s standard library. For generating hashes of files and other data sources, we use the hashlib library. This built-in library has support for common algorithms, such as MD5, SHA-1, SHA-256, and more. As of the writing of this book, many tools still leverage the MD5 and SHA-1 algorithms, though the current recommendation is to use SHA-256 at a minimum. Alternatively, one could use multiple hashes of a file to further decrease the odds of a hash collision. While we'll showcase a few of these algorithms, there are other, less commonly used, algorithms available.

To learn more about the hashlib library, visit https://docs.python.org/3/library/hashlib.html.
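To make the two cases concrete, the sketch below hashes a string and then a file read as a stream of chunks. The function names, the default algorithm, and the chunk size are illustrative choices, not taken from the recipe.

```python
import hashlib
import os
import tempfile


def hash_string(text, algorithm="sha256"):
    """Hash a string after encoding it to bytes."""
    hasher = hashlib.new(algorithm)
    hasher.update(text.encode("utf-8"))
    return hasher.hexdigest()


def hash_file(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file chunk by chunk so its size does not limit us."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # read() returns b"" at end of file, which stops the loop
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


# A throwaway file holding the same bytes as the string below
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"abc")

string_digest = hash_string("abc")
file_digest = hash_file(tmp.name)
os.remove(tmp.name)

print(string_digest)
print(file_digest)
```

Both calls print the same digest, since the file contains exactly the bytes of the encoded string. Passing a different algorithm name supported by hashlib, such as "sha1", switches the hash without changing the rest of the code.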

How to do it…