Mastering Machine Learning for Penetration Testing - Chiheb Chebbi - E-Book


Description

Become a master at penetration testing using machine learning with Python




Key Features



  • Identify ambiguities and breach intelligent security systems


  • Perform unique cyber attacks to breach robust systems


  • Learn to leverage machine learning algorithms





Book Description



Cyber security is crucial for both businesses and individuals. As systems are getting smarter, we now see machine learning disrupting computer security. With the adoption of machine learning in upcoming security products, it's important for pentesters and security researchers to understand how these systems work, and to breach them for testing purposes.






This book begins with the basics of machine learning and the algorithms used to build robust systems. Once you've gained a fair understanding of how security products leverage machine learning, you'll dive into the core concepts of breaching such systems. Through practical use cases, you'll see how to find loopholes and surpass a self-learning security system.






As you make your way through the chapters, you'll focus on topics such as network intrusion detection and AV and IDS evasion. We'll also cover the best practices when identifying ambiguities, and extensive techniques to breach an intelligent system.






By the end of this book, you will be well-versed with identifying loopholes in a self-learning security system and will be able to efficiently breach a machine learning system.





What you will learn



  • Take an in-depth look at machine learning


  • Get to know natural language processing (NLP)


  • Understand malware feature engineering


  • Build generative adversarial networks using Python libraries


  • Work on threat hunting with machine learning and the ELK stack


  • Explore the best practices for machine learning





Who this book is for



This book is for pen testers and security professionals who are interested in learning techniques to break an intelligent security system. Basic knowledge of Python is needed, but no prior knowledge of machine learning is necessary.

You can read this e-book in the Legimi apps or in any app that supports the following format:

EPUB

Number of pages: 177

Year of publication: 2018




Mastering Machine Learning for Penetration Testing

 

 

Develop an extensive skill set to break self-learning systems using Python

 

 

 

 

 

 

 

 

 

Chiheb Chebbi

 

 

 

 

 

 

 

BIRMINGHAM - MUMBAI

Mastering Machine Learning for Penetration Testing

Copyright © 2018 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Commissioning Editor: Vijin Boricha
Acquisition Editor: Heramb Bhavsar
Content Development Editor: Nithin George Varghese
Technical Editor: Komal Karne
Copy Editor: Safis Editing
Project Coordinator: Virginia Dias
Proofreader: Safis Editing
Indexer: Tejal Daruwale Soni
Graphics: Tom Scaria
Production Coordinator: Aparna Bhagat

First published: June 2018

Production reference: 1260618

Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.

ISBN 978-1-78899-740-9

www.packtpub.com

I dedicate this book to every person who makes the security community awesome and fun!
mapt.io

Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry leading tools to help you plan your personal development and advance your career. For more information, please visit our website.

Why subscribe?

Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals

Improve your learning with Skill Plans built especially for you

Get a free eBook or video every month

Mapt is fully searchable

Copy and paste, print, and bookmark content

PacktPub.com

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.

Contributors

About the author

Chiheb Chebbi is an InfoSec enthusiast who has experience in various aspects of information security, focusing on the investigation of advanced cyber attacks and researching cyber espionage and APT attacks. Chiheb is currently pursuing an engineering degree in computer science at TEK-UP university in Tunisia. 

His core interests are infrastructure penetration testing, deep learning, and malware analysis. In 2016, he was included in the Alibaba Security Research Center Hall Of Fame. His talk proposals were accepted by DeepSec 2017, Blackhat Europe 2016, and many world-class information security conferences.

I would like to thank my parents and friends who have always been a great support. I'd like to extend my thanks to Packt folks, especially Nithin, Heramb, and Komal for giving me the opportunity to get involved in this book.

About the reviewer

Aditya Mukherjee is a proficient information security professional, cybersecurity speaker, entrepreneur, cybercrime investigator, and columnist.

He has 10+ years of experience in different leadership roles across information security domains with various reputed organizations, specializing in the implementation of cybersecurity solutions, cyber transformation projects, and solving problems associated with security architecture, framework, and policies.

 

 

 

 

 

Packt is searching for authors like you

If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.

Table of Contents

Title Page

Copyright and Credits

Mastering Machine Learning for Penetration Testing

Dedication

Packt Upsell

Why subscribe?

PacktPub.com

Contributors

About the author

About the reviewer

Packt is searching for authors like you

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Get in touch

Reviews

Introduction to Machine Learning in Pentesting

Technical requirements

Artificial intelligence and machine learning  

Machine learning models and algorithms 

Supervised

Bayesian classifiers

Support vector machines

Decision trees 

Semi-supervised

Unsupervised

Artificial neural networks 

Linear regression 

Logistic regression

Clustering with k-means 

Reinforcement

Performance evaluation 

Dimensionality reduction

Improving classification with ensemble learning 

Machine learning development environments and Python libraries

NumPy

SciPy

TensorFlow

Keras

pandas

Matplotlib

scikit-learn

NLTK

Theano

Machine learning in penetration testing - promises and challenges

Deep Exploit

Summary

Questions

Further reading

Phishing Domain Detection

Technical requirements

Social engineering overview

Social Engineering Engagement Framework

Steps of social engineering penetration testing

Building real-time phishing attack detectors using different machine learning models

Phishing detection with logistic regression

Phishing detection with decision trees

NLP in-depth overview

Open source NLP libraries

Spam detection with NLTK

Summary

Questions

Malware Detection with API Calls and PE Headers

Technical requirements

Malware overview

Malware analysis      

Static malware analysis

Dynamic malware analysis

Memory malware analysis

Evasion techniques

Portable Executable format files 

Machine learning malware detection using PE headers 

Machine learning malware detection using API calls

Summary

Questions

Further reading

Malware Detection with Deep Learning

Technical requirements

Artificial neural network overview

Implementing neural networks in Python

Deep learning model using PE headers

Deep learning model with convolutional neural networks and malware visualization

Convolutional Neural Networks (CNNs)

Recurrent Neural Networks (RNNs)

Long Short Term Memory networks

Hopfield networks

Boltzmann machine networks

Malware detection with CNNs

Promises and challenges in applying deep learning to malware detection

Summary

Questions

Further reading

Botnet Detection with Machine Learning

Technical requirements

Botnet overview

Building a botnet detector model with multiple machine learning techniques

How to build a Twitter bot detector

Visualization with seaborn

Summary

Questions

Further reading

Machine Learning in Anomaly Detection Systems

Technical requirements

An overview of anomaly detection techniques

Static rules technique

Network attacks taxonomy

The detection of network anomalies

HIDS

NIDS

Anomaly-based IDS

Building your own IDS

The Kale stack

Summary

Questions

Further reading

Detecting Advanced Persistent Threats

Technical requirements

Threats and risk analysis

Threat-hunting methodology

The cyber kill chain

The diamond model of intrusion analysis

Threat hunting with the ELK Stack

Elasticsearch

Kibana

Logstash

Machine learning with the ELK Stack using the X-Pack plugin

Summary

Questions

Evading Intrusion Detection Systems

Technical requirements

Adversarial machine learning algorithms

Overfitting and underfitting

Overfitting and underfitting with Python

Detecting overfitting

Adversarial machine learning

Evasion attacks

Poisoning attacks

Adversarial clustering

Adversarial features

CleverHans

The AML library 

EvadeML-Zoo

Evading intrusion detection systems with adversarial network systems

Summary

Questions

Further reading

Bypassing Machine Learning Malware Detectors

Technical requirements

Adversarial deep learning

Foolbox

Deep-pwning

EvadeML

Bypassing next generation malware detectors with generative adversarial networks

The generator

The discriminator

MalGAN

Bypassing machine learning with reinforcement learning

Reinforcement learning

Summary

Questions

Further reading

Best Practices for Machine Learning and Feature Engineering

Technical requirements

Feature engineering in machine learning

Feature selection algorithms

Filter methods

Pearson's correlation

Linear discriminant analysis

Analysis of variance

Chi-square

Wrapper methods

Forward selection

Backward elimination

Recursive feature elimination

Embedded methods

Lasso linear regression L1

Ridge regression L2

Tree-based feature selection

Best practices for machine learning

Information security datasets

Project Jupyter

Speed up training with GPUs

Selecting models and learning curves

Machine learning architecture

Coding

Data handling

Business contexts

Summary

Questions

Further reading

Assessments

Chapter 1 – Introduction to Machine Learning in Pentesting 

Chapter 2 – Phishing Domain Detection

Chapter 3 – Malware Detection with API Calls and PE Headers 

Chapter 4 – Malware Detection with Deep Learning 

Chapter 5 – Botnet Detection with Machine Learning 

Chapter 6 – Machine Learning in Anomaly Detection Systems 

Chapter 7 – Detecting Advanced Persistent Threats 

Chapter 8 – Evading Intrusion Detection Systems with Adversarial Machine Learning

Chapter 9 – Bypass Machine Learning Malware Detectors

Chapter 10 – Best Practices for Machine Learning and Feature Engineering

Other Books You May Enjoy

Leave a review - let other readers know what you think

Preface

Currently, machine learning techniques are some of the hottest trends in information technology. They affect every aspect of our lives and every industry and field. Machine learning is a cyber weapon for information security professionals. In this book, you will not only explore the fundamentals of machine learning techniques, but will also learn the secrets to building a fully functional machine learning security system. We will not stop at building defensive layers; we will also explore how to attack machine learning models with adversarial learning. Mastering Machine Learning for Penetration Testing will provide educational as well as practical value.

Who this book is for

Mastering Machine Learning for Penetration Testing is for pen testers and security professionals who are interested in learning techniques for breaking an intelligent security system. A basic knowledge of Python is needed, but no prior knowledge of machine learning is necessary.

What this book covers

Chapter 1, Introduction to Machine Learning in Pentesting, introduces readers to the fundamental concepts of the different machine learning models and algorithms, and shows how to evaluate them. It then shows us how to prepare a machine learning development environment using many data science Python libraries.

Chapter 2, Phishing Domain Detection, guides us on how to build machine learning models to detect phishing emails and spam attempts using different algorithms and natural language processing (NLP).

Chapter 3, Malware Detection with API Calls and PE Headers, explains the different approaches to analyzing malware and malicious software, and later introduces us to some different techniques for building a machine learning-based malware detector.

Chapter 4, Malware Detection with Deep Learning, extends what we learned in the previous chapter to explore how to build artificial neural networks and deep learning models to detect malware.

Chapter 5, Botnet Detection with Machine Learning, demonstrates how to build a botnet detector using the previously discussed techniques and publicly available botnet traffic datasets.

Chapter 6, Machine Learning in Anomaly Detection Systems, introduces us to the most important terminology in anomaly detection and guides us through building machine learning anomaly detection systems.

Chapter 7, Detecting Advanced Persistent Threats, shows us how to build a fully working real-world threat hunting platform using the ELK stack, which already comes with machine learning capabilities.

Chapter 8, Evading Intrusion Detection Systems with Adversarial Machine Learning, demonstrates how to bypass machine learning systems using adversarial learning and studies some real-world cases, including bypassing next-generation intrusion detection systems.

Chapter 9, Bypass Machine Learning Malware Detectors, teaches us how to bypass machine learning-based malware detectors with adversarial learning and generative adversarial networks.

Chapter 10, Best Practices for Machine Learning and Feature Engineering, explores different feature engineering techniques and introduces readers to machine learning best practices for building reliable systems.

To get the most out of this book

We assume that the readers of this book are familiar with basic information security concepts and Python programming. Some of the demonstrations in this book require more practice and online research to delve into the concepts discussed.

If you encounter any bugs, typos, or errors, check the GitHub repository of this book for updated code.

Download the example code files

You can download the example code files for this book from your account at www.packtpub.com. If you purchased this book elsewhere, you can visit www.packtpub.com/support and register to have the files emailed directly to you.

You can download the code files by following these steps:

Log in or register at www.packtpub.com.

Select the SUPPORT tab.

Click on Code Downloads & Errata.

Enter the name of the book in the Search box and follow the onscreen instructions.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR/7-Zip for Windows

Zipeg/iZip/UnRarX for Mac

7-Zip/PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Mastering-Machine-Learning-for-Penetration-Testing. In case there's an update to the code, it will be updated on the existing GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it from https://www.packtpub.com/sites/default/files/downloads/MasteringMachineLearningforPenetrationTesting_ColorImages.pdf.

Get in touch

Feedback from our readers is always welcome.

General feedback: Email [email protected] and mention the book title in the subject of your message. If you have questions about any aspect of this book, please email us at [email protected].

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details.

Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Reviews

Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!

For more information about Packt, please visit packtpub.com.

Introduction to Machine Learning in Pentesting

Currently, machine learning techniques are some of the hottest trends in information technology. They impact every aspect of our lives, and they affect every industry and field. Machine learning is a cyber weapon for information security professionals. In this book, readers will not only explore the fundamentals behind machine learning techniques, but will also learn the secrets to building a fully functional machine learning security system. We will not stop at building defensive layers; we will illustrate how to build offensive tools to attack and bypass security defenses. By the end of this book, you will be able to bypass machine learning security systems and use the models constructed in penetration testing (pentesting) missions. 

In this chapter, we will cover:

Machine learning models and algorithms

Performance evaluation metrics

Dimensionality reduction

Ensemble learning

Machine learning development environments and Python libraries

Machine learning in penetration testing – promises and challenges

Technical requirements

In this chapter, we are going to build a development environment. Therefore, we are going to install the following Python machine learning libraries:

NumPy

SciPy

TensorFlow

Keras

pandas

Matplotlib

scikit-learn

NLTK

Theano

You will also find all of the scripts and installation guides used in this GitHub repository: https://github.com/PacktPublishing/Mastering-Machine-Learning-for-Penetration-Testing/tree/master/Chapter01.
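Before moving on, it can help to confirm which of the libraries listed above are already available in your Python environment. The following is a minimal sketch rather than part of the book's scripts: the helper name check_environment is hypothetical, and the import names are assumed to be the usual ones (note that scikit-learn is imported as sklearn).

```python
import importlib.util

# Import names for the libraries listed above (assumed to be the usual ones;
# scikit-learn is imported as "sklearn").
REQUIRED = ["numpy", "scipy", "tensorflow", "keras", "pandas",
            "matplotlib", "sklearn", "nltk", "theano"]


def check_environment(modules=REQUIRED):
    """Return a dict mapping each module name to True if it can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    # Print one status line per library without actually importing it.
    for name, available in check_environment().items():
        print(f"{name:12s} {'ok' if available else 'MISSING'}")
```

Any library reported as MISSING can then be installed by following the installation guides in the repository.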

Artificial intelligence and machine learning  

Making a machine think like a human is one of the oldest dreams. Machine learning techniques are used to help make predictions based on experiences and data.

Machine learning models and algorithms 

In order to teach machines how to solve a large number of problems by themselves, we need to consider the different machine learning models. As you know, we need to feed the model with data; that is why machine learning models are divided, based on the kind of input data they are given, into four major categories: supervised learning, semi-supervised learning, unsupervised learning, and reinforcement learning. In this section, we are going to describe each model in detail and explore the most well-known algorithms used in each one. Before building machine learning systems, we need to know how things work underneath the surface.

Supervised

We talk about supervised machine learning when we have both the input variables and the output variables. In this case, we need to learn the function (or pattern) that maps the inputs to the outputs. The following are some of the most often used supervised machine learning algorithms.
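To make the idea of learning from input and output pairs concrete, here is a small sketch using only the Python standard library. The data is made up for illustration (the features url_length and number_of_dots are assumptions, not taken from the book's datasets), and the technique used is a simple 1-nearest-neighbour rule, chosen only because it fits in a few lines.

```python
import math

# Toy labelled dataset (made up for illustration): each input is
# (url_length, number_of_dots) and each output is a class label.
X_train = [(20, 1), (25, 2), (90, 6), (110, 8)]
y_train = ["legitimate", "legitimate", "phishing", "phishing"]


def predict(x, inputs=X_train, outputs=y_train):
    """1-nearest-neighbour: return the label of the closest training input."""
    distances = [math.dist(x, xi) for xi in inputs]
    return outputs[distances.index(min(distances))]


# New, unlabelled inputs receive the label of their nearest labelled neighbour.
print(predict((22, 1)))
print(predict((105, 7)))
```

The labelled pairs play the role of the supervision: the model never sees a rule, only examples of inputs together with their expected outputs.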

Bayesian classifiers

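Bayesian methods take their name from Thomas Bayes and Bayes' theorem. They should not be confused with bias, which the Cambridge English Dictionary defines as the action of supporting or opposing a particular person or thing in an unfair way, allowing personal opinions to influence your judgment. Bayesian machine learning refers to having a prior belief, and updating it later by using data. Mathematically, it is based on the Bayes formula:

P(A|B) = P(B|A) × P(A) / P(B)

Here, P(A) is the prior belief in A, P(B|A) is the likelihood of observing the data B if A holds, P(B) is the overall probability of the data, and P(A|B) is the updated (posterior) belief.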

One of the simplest Bayesian problems is randomly tossing a coin and trying to predict whether the output will be heads or tails. That is why we can identify Bayesian methodology as being probabilistic. Naive Bayes is very useful when you are using a small amount of data.
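The coin example can be turned into a short worked calculation. The sketch below assumes two hypothetical hypotheses that are not in the original text: a fair coin that lands heads with probability 0.5 and a biased coin that lands heads with probability 0.8, each believed equally likely at the start. It applies the Bayes formula once per observed toss.

```python
def posterior(prior, likelihood):
    """Apply Bayes' formula: P(H|D) = P(D|H) * P(H) / P(D)."""
    # P(D) is the total probability of the observation over all hypotheses.
    evidence = sum(prior[h] * likelihood[h] for h in prior)
    return {h: prior[h] * likelihood[h] / evidence for h in prior}


# Assumed prior belief and assumed probability of heads under each hypothesis.
prior = {"fair": 0.5, "biased": 0.5}
p_heads = {"fair": 0.5, "biased": 0.8}

belief = prior
for toss in range(3):  # observe heads three times in a row
    belief = posterior(belief, p_heads)
    print(toss + 1, belief)
```

Each observed heads shifts the belief further towards the biased coin, which is exactly the "prior belief updated by data" idea described above.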

Support vector machines

A support vector machine (SVM) is a supervised machine learning model that works by identifying a hyperplane between represented data. The data can be represented in a multidimensional space. Thus, SVMs are widely used in classification models. In an SVM, the hyperplane that best separates the different classes will be used. In some cases, when we have different hyperplanes that separate different classes, identification of the correct one will be performed thanks to something called a margin, or a gap. The margin is the nearest distance between the hyperplanes and the data positions. You can take a look at the following representation to check for the margin:
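As a numerical complement, the margin of a candidate hyperplane can be computed directly as the smallest distance from any data point to it. The following sketch uses made-up points and two hand-picked candidate hyperplanes (both are assumptions for illustration); it only compares candidates and does not implement the optimization that a real SVM performs.

```python
import math


def margin(w, b, points):
    """Smallest distance from any point to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(abs(sum(wi * xi for wi, xi in zip(w, p)) + b) / norm
               for p in points)


# Two made-up classes: (1, 1), (2, 1) versus (4, 4), (5, 5).
points = [(1, 1), (2, 1), (4, 4), (5, 5)]

# Two candidate separating lines, written as (w, b).
candidates = {
    "A": ((1, 1), -5),     # x + y - 5 = 0
    "B": ((1, 0), -2.5),   # x - 2.5 = 0
}

# Both candidates separate the classes; prefer the one with the larger margin.
best = max(candidates, key=lambda name: margin(*candidates[name], points))
for name, (w, b) in candidates.items():
    print(name, round(margin(w, b, points), 3))
print("selected:", best)
```

Candidate A keeps every point further away from the boundary than candidate B does, so it is the one that would be preferred.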

The hyperplane with the highest gap will be selected. If we choose the hyperplane with the shortest margin, we might face misclassification problems later. Don't be distracted by the previous graph; the hyperplane will not always be linear. Consider a case like the following:

In the preceding situation, we can add a new axis, called the z axis, and apply a transformation known as the kernel trick, using a kernel function such as z = x^2 + y^2. If you apply the transformation, the new graph will be as follows:
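The effect of this transformation can also be checked numerically. The sketch below uses made-up points (an inner cluster surrounded by an outer ring, which no straight line in the x-y plane separates) and a hypothetical helper named lift that adds the z coordinate. After the transformation, a single threshold on z, that is, a flat plane, separates the two groups.

```python
# Made-up 2D points: an inner cluster and an outer ring around it.
inner = [(0.5, 0.0), (0.0, -0.5), (-0.3, 0.4), (0.2, 0.2)]
outer = [(2.0, 0.0), (0.0, 2.0), (-1.5, 1.5), (1.4, -1.6)]


def lift(point):
    """Map (x, y) to (x, y, z) with z = x^2 + y^2."""
    x, y = point
    return (x, y, x ** 2 + y ** 2)


# In the lifted space, the plane z = threshold acts as the separating hyperplane.
threshold = 1.0
inner_z = [lift(p)[2] for p in inner]
outer_z = [lift(p)[2] for p in outer]

print("inner z values:", inner_z)
print("outer z values:", outer_z)
print("separable by z =", threshold, ":", max(inner_z) < threshold < min(outer_z))
```

Strictly speaking, this explicit mapping is a feature transformation; kernel-based SVM implementations typically obtain the same effect without computing the new coordinates directly.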

Now, we can identify the right hyperplane. The transformation is called a kernel