INTELLIGENT SECURITY SYSTEMS
Dramatically improve your cybersecurity using AI and machine learning
In Intelligent Security Systems, distinguished professor and computer scientist Dr. Leon Reznik delivers an expert synthesis of artificial intelligence, machine learning, and data science techniques, applied to computer security to assist readers in hardening their computer systems against threats. Emphasizing practical and actionable strategies that can be immediately implemented by industry professionals and computer device owners, the author explains how to install and harden firewalls, intrusion detection systems, attack recognition tools, and malware protection systems. He also explains how to recognize and counter common hacking activities. This book bridges the gap between cybersecurity education and new data science programs, discussing how cutting-edge artificial intelligence and machine learning techniques can work for and against cybersecurity efforts. Intelligent Security Systems includes supplementary resources on an author-hosted website, such as classroom presentation slides, sample review, test, and exam questions, and practice exercises to make the material contained practical and useful.
The book also offers:
* A thorough introduction to computer security, artificial intelligence, and machine learning, including basic definitions and concepts like threats, vulnerabilities, risks, attacks, protection, and tools
* An exploration of firewall design and implementation, including firewall types and models, typical designs and configurations, and their limitations and problems
* Discussions of intrusion detection systems (IDS), including architecture topologies, components, and operational ranges, classification approaches, and machine learning techniques in IDS design
* A treatment of malware and vulnerability detection and protection, including malware classes, history, and development trends
Perfect for undergraduate and graduate students in computer security, computer science, and engineering, Intelligent Security Systems will also earn a place in the libraries of students and educators in information technology and data science, as well as professionals working in those fields.
Page count: 818
Year of publication: 2021
IEEE Press
445 Hoes Lane
Piscataway, NJ 08854
IEEE Press Editorial Board
Ekram Hossain, Editor in Chief
Jón Atli Benediktsson
Xiaoou Li
Jeffrey Reed
Anjan Bose
Lian Yong
Diomidis Spinellis
David Alan Grier
Andreas Molisch
Sarah Spurgeon
Elya B. Joffe
Saeid Nahavandi
Ahmet Murat Tekalp
Leon Reznik
Rochester Institute of Technology, Rochester, New York, USA
Copyright © 2022 by The Institute of Electrical and Electronics Engineers, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per‐copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750‐8400, fax (978) 750‐4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748‐6011, fax (201) 748‐6008, or online at http://www.wiley.com/go/permissions.
Limit of Liability/Disclaimer of Warranty: While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762‐2974, outside the United States at (317) 572‐3993 or fax (317) 572‐4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging‐in‐Publication Data applied for:
ISBN: 9781119771531
Cover Design: Wiley
Cover Image: © AerialPerspective Works/E+/Getty Images
To my family, Alexandra, Dmitry, Michelle, and my students
Many organizations and individuals helped this book to appear, including my colleagues, students, editorial staff, family, and friends. I would like to thank all of you but have to limit the list of names. Thank you very much: Adam, Adrian, Adwait, Aileen, Akhil, Akshay, Alex, Alok, Amit, Arpit, Ashwin, Andrew, Andrey, Ankan, Anna, Anthony, Asif, Ayush, Benjamin, Brian, Carl, Chinmay, Christian, Darrell, Devang, Dhaval, Dhivya, Dileep, Dinesh, Dmitry, Elisa, Forum, Gaurav, George, Greg, Howard, Igor, James, Jeffrey, Jeton, Jinesh, Jody, Joe, Josh, Juan, Juliet, Justin, Karl, Karteek, Krishna, Kurt, Maninder, Mansha, Matthew, Michael, Michelle, Milan, Mohammed, Mohan, Ninad, Ninel, Olga, Omar, Parinitha, Parth, Pooja, Praful, Punit, Qiaoran, Raja, Ravina, Renzil, Richard, Rishi, Robert, Rohan, Rohit, Roman, Ron, Sagar, Sahil, Salil, Samir, Sanjay, Sandhya, Saransh, Saurabh, Scott, Sergey, Shashank, Shravya, Simran, Stanislaw, Sudhish, Suraj, Suresh, Swati, Tayeb, Tejas, Utsav, Vanessa, Virendra, and Vladik.
Last but not least, I want to acknowledge that some research reported in this book was supported in part by the following recent grants: National Science Foundation (award # ACI‐1547301), National Security Agency (award # H98230‐I7‐l‐0200), and US Military Academy/DoD (award # W911NF2010337).
This book’s main goal is to provide help to its readers and users:
students in computer security, science, engineering, IT and information systems‐related fields, both undergraduate and graduate, looking for a textbook to gain the knowledge and skills at the intersection of computer security and artificial intelligence, machine learning, and data science domains;
their instructors at the universities, colleges, and institutions of higher education looking for a textbook and curriculum materials (review and test questions, notes, exercises, slides) to use in developing new and modifying existing courses;
professionals in the computer security area looking for a reference book to upgrade their skills and better understand intelligent techniques;
professionals and researchers in the field of artificial intelligence and data science looking for advice on where and how to apply their knowledge and skills in the computer security domain.
While a general background in computing, networking, security, and artificial intelligence is desirable, the book is self‐contained and starts with a review of computer security and intelligent techniques that should provide a sufficient foundation for further study.
This book aims to help its readers better understand how to apply artificial intelligence, machine learning, and data science in the computer security domain. It introduces readers to the current state of the application of intelligent methodologies in the design of computer security and information assurance systems. As the design and operation of most computer security systems and tools are based on an application of intelligent techniques, gaining a deeper understanding and practical skills in this field will allow readers to be better prepared either to enter the workforce or to upgrade their skills. The book merges the most advanced methodologies of artificial intelligence and machine learning with their applications in cybersecurity. Readers will gain knowledge in one of the most active areas of current computer science and will be able to employ it in solving cybersecurity problems.
Unfortunately, a gap currently exists between computer security practice, where professionals mostly employ various tools, often without a deep understanding of their design and functionality principles, and the comprehension of computer science methods and algorithms in general and of artificial intelligence, machine learning techniques, and data science in particular. Students and even professionals do not realize that most tools they employ in computer security have been designed based on an application of intelligent methodologies. This lack of knowledge prevents them from designing better tools and even from employing existing ones more effectively and efficiently. The unique approach of this book is that it is designed to fill this gap by concentrating on the design features of computer security tools and mechanisms on the one hand and discussing how intelligent procedures are employed in industrial practice on the other.
The idea of this book is innovative and unique. It merges knowledge areas as diverse as artificial intelligence and machine learning techniques and computer security systems and applications. By crossing the traditional borders between disciplines, it allows readers to acquire unique knowledge in the very intense domain where intelligent methods intersect with computer security applications and to become much better prepared for the challenges of computer security practice. It aims at developing theoretical knowledge as well as research and practical skills.
The book doubles as a textbook and a reference book. From the education perspective, the book bridges education in the cybersecurity domain with computer science and new data science programs, helping to advance all of them together. The content ranges from an explanation of basic concepts to a brief description of available tools. The writing style includes a traditional narrative as well as formulating and answering essential questions that guide the presentation. The questions will help in self‐education and will assist instructors who might like to use them in their courses to be better prepared for possible student inquiries. The book includes exercises. Slides will be available on the author’s website, https://www.cs.rit.edu/~lr/. Instructors will be provided with a list of suggested test and exam questions.
The book is oriented toward computer security practice, not its mathematical foundations. The book will teach how to design widely used computer security systems and tools such as firewalls, intrusion detection systems, anti‐malware protection systems, and tools for recognizing hacking activities and attacks. Readers will gain a deeper understanding of the design of those systems and tools. While discussing machine learning and data science algorithms, the book does not go deep into mathematical details but prefers to concentrate on possible applications.
Some other manuscripts claim to provide comprehensive coverage of either the computer security domain or the artificial intelligence, machine learning, and data science domain. Given the extremely wide content areas of both domains, this book does not aim at a full review of two of the currently hottest areas in modern engineering and technology. Instead, the book is fully devoted to the applications of artificial intelligence, machine learning, and data science in the design and analysis of computer security systems, mechanisms, and tools as well as in solving other security problems. It will discuss the application of intelligent techniques in firewalls, intrusion detection, malware detection, hacking activity recognition, and system security evaluation. It will review various attacks against computer security, ranging from simple phishing inquiries to sophisticated attacks against intelligent classifiers based on machine learning techniques. While not giving 100% coverage of the computer security or artificial intelligence domains, the book deals with the most important growing areas of both fields. The coverage ratio will increase as a growing share of real computer security activities becomes more and more dependent on artificial intelligence. With this knowledge, readers will become frontrunners in the design of novel cybersecurity tools and mechanisms needed to protect computer networks and systems and national infrastructure.
The book consists of six large chapters (see Figure I.1) covering the following specialized topics:
review of the modern state of the computer security and artificial intelligence, machine learning, and data science applications in the area;
firewall design;
intrusion detection systems;
anti‐malware methods and tools;
hacking activities, attack recognition, and prevention;
adversarial attacks against AI‐based computer security tools and systems.
Figure I.1 Book organization.
The book will be accompanied by presentation slides as well as samples of exercises, test and exam questions, research, and tool assignments.
From the computer security perspective, the book moves the reader from a review of the current situation through the traditional first line of defense (firewalls) and the second line of defense (intrusion detection systems) to a discussion of modern malware families and anti‐malware protection, then on to the profiles and typical activities of hackers and ordinary users, and finishes by discussing privacy protection systems and adversarial attacks using machine learning techniques.
From the artificial intelligence perspective, the book starts with the review of artificial intelligence, machine learning, and data science techniques and technologies, then discusses the logic of the rules‐based and expert systems, and proceeds with machine learning and data science applications in the computer security domain. It presents multiple algorithms and methods, especially focusing on artificial neural networks, including shallow learning models, deep learning procedures, and generative adversarial networks.
While the book covers the major security mechanisms as well as the intelligent techniques they employ, these topics are distributed over all chapters. With respect to the techniques, the book generally moves from older (and possibly simpler) methods to newer (and possibly more sophisticated) ones. However, each chapter is self‐contained and could be studied separately from the others.
In particular:
Chapter 1 discusses the basic concepts of computer security as well as the taxonomy and classification of the fundamental algorithms in the domains of artificial intelligence, machine learning, and data science in relation to their applications in computer security. It reviews the sources of security threats and attacks, concentrating on the area of IoT and wireless devices, and examines the possible protection mechanisms and tools. The chapter provides a general classification of intelligent approaches and their relationship to various computer security fields. It focuses on an introduction of the major intelligent techniques and technologies in computer security, such as expert systems, fuzzy logic, machine learning, artificial neural networks, and genetic algorithms. While presenting multiple techniques, the text emphasizes their advantages relative to one another as well as the obstacles to their further progress. Short algorithm descriptions and code examples are included.
Chapter 2 introduces the firewall as the first line of defense. It defines the firewall and discusses its functions, possible architectures, and operational models, concentrating on their advantages and drawbacks. It includes a step‐by‐step guide to the firewall design and implementation process, ranging from planning to deployment and maintenance. The major emphasis in this chapter is placed on using rules to set up, configure, and modify the firewall’s policy. Both generic and specific rules are discussed, as well as their formulation and editing with firewall tools. Rule design principles and conflict avoidance and resolution are presented in substantial detail.
Chapter 3 develops knowledge and practical skills in the design, analysis, implementation, and use of intrusion detection and prevention systems (IDS). It defines IDS and discusses their goals and functions as well as their progress from the historical perspective. It advances the reader’s design and analysis skills in the computer security domain by discussing artificial intelligence and machine learning techniques and their application in IDS design and implementation, as well as in classifying IDS, evaluating IDS performance, choosing IDS design tools, and employing them in a practical design exercise. Algorithm and code examples are provided.
Chapter 4 discusses malware types and the techniques and tools for malware detection and recognition. It provides an extensive classification of various malware and virus families and discusses their taxonomy, basic composition, and how they compare with one another. Beyond pure malware examples, it also reviews spam and software vulnerabilities. Multiple real‐life cases and examples are provided. It then moves on to malware detection principles, algorithms, and techniques, and to anti‐malware tools and technologies, with examples and use cases.
Chapter 5 starts by discussing how hacker demographics and culture have been changing over recent years. It then presents hacking attacks, techniques, and tools as well as anti‐hacking protection mechanisms. In the second part, it moves on to ordinary users’ profiles and authentication. Here, we show how to employ data science and statistical approaches to identify and analyze users’ characteristics and their influence on the security level of their computer practice. The chapter presents computer device security evaluation. It discusses how to conduct the analysis and covers observations, results, and recommendations for users to improve their overall security practices and the security of their devices. It also examines website fingerprinting attacks, which utilize machine learning, against the TOR privacy protection technology, as well as possible protection mechanisms. Examples and use cases are included.
Chapter 6 introduces novel adversarial machine learning attacks, in which machine learning is used against AI‐based classifiers to make them fail, and their taxonomy. It investigates how possible data corruption and a decrease in data quality influence classifier performance. The chapter proposes data restoration procedures and other measures to protect against adversarial attacks. Generative adversarial networks are introduced, and their use is discussed. Multiple algorithm examples and use cases are included.
This section lists standard terms used within the book and where to learn more about them.
Term
Additional term
Definition
Definition source
Book section to learn more
Example
Offense
Attack
Any kind of malicious activity that attempts to collect, disrupt, deny, degrade, or destroy information system resources or the information itself.
NIST SP 800‐12;
1.4
Cyber attack
An attack, via cyberspace, targeting an enterprise’s use of cyberspace for the purpose of disrupting, disabling, destroying, or maliciously controlling a computing environment/infrastructure; or destroying the integrity of the data or stealing controlled information.
NIST SP 800‐30 Rev. 1
5.1.5
Advanced persistent threat (APT)
An adversary with sophisticated levels of expertise and significant resources, allowing it through the use of multiple different attack vectors (e.g. cyber, physical, and deception) to generate opportunities to achieve its objectives, which are typically to establish and extend footholds within the information technology infrastructure of organizations for purposes of continually exfiltrating information and/or to undermine or impede critical aspects of a mission, program, or organization, or place itself in a position to do so in the future; moreover, the advanced persistent threat pursues its objectives repeatedly over an extended period of time, adapting to a defender’s efforts to resist it, and with determination to maintain the level of interaction needed to execute its objectives.
NIST SP 800‐39
1,6
Adversarial machine learning (AML)
AML is concerned with the design of ML algorithms that can resist security challenges, the study of the capabilities of attackers, and the understanding of attack consequences.
NISTIR 8269 (DRAFT)
6
Attack signature
A specific sequence of events indicative of an unauthorized access attempt.
NIST SP 800‐12 Rev. 1;
4.5
Brute force
A method of accessing an obstructed device by attempting multiple combinations of numeric/alphanumeric passwords.
NIST 800‐101
5.1.5.2
Colluded applications
Attack performed by two or more cooperating applications, when an application that individually incorporates only harmless permissions expends them by sending and receiving requests to a collaborating application.
5.1.8
Denial of Service
The prevention of authorized access to resources or the delaying of time‐critical operations. (Time‐critical may be milliseconds or it may be hours, depending upon the service provided.)
NIST 800‐12
5.1.5.2
Ex. 5.4
Eavesdropping
An attack in which an attacker listens passively to the authentication protocol to capture information that can be used in a subsequent active attack to masquerade as the claimant.
NIST 800‐63‐3
5.1.5.2
Impersonation
A scenario where the attacker impersonates the verifier in an authentication protocol, usually to capture information that can be used to masquerade as a claimant to the real verifier.
NIST 800‐63‐2
5.1.5.2
Phishing
Fraudulent attempt to obtain sensitive information or data by posing as a trustworthy entity in a digital communication.
5.1.5.2
Ex. 5.3
Spoofing
Faking the sending address of a transmission to gain illegal entry into a secure system.
CNSSI 4009‐2015
5.1.5.2.
Ex. 5.7
Website fingerprinting
Attack that allows an adversary to learn information about a user's web browsing activity by recognizing patterns in his traffic.
5.4.2
Ex. 5.8
Zero day
An attack that exploits a previously unknown hardware, firmware, or software vulnerability.
CNSSI 4009‐2015
5.1.5.3
Cyber crime
Criminal activities carried out by means of computers or the Internet.
1.1
Ex. 5.2
Hacker
Unauthorized user who attempts to or gains access to an information system.
NIST SP 800‐12
5.1
Ex. 5.1, 5.2
Malware
Hardware, firmware, or software that is intentionally included or inserted in a system for a harmful purpose.
NIST SP 800‐12
4
Adware
Software that automatically displays or downloads advertising material (often unwanted) when a user is online.
4.2.6
Ex.4.11
Botnet
Attack conducted with the help of more traditional malware types, such as worms and Trojans.
4.2.9
Ex.4.15, 4.16,
Ransomware
Type of malware, which prevents users from accessing their system functionality or data, either by locking the system's screen or by locking the users' files unless a ransom is paid.
1.3, 4.2.7
Ex. 1.3, 1.4, 4.12, 4.13
Rootkit
A set of tools used by an attacker after gaining root‐level access to a host to conceal the attacker’s activities on the host and permit the attacker to maintain root‐level access to the host through covert means.
NIST SP 800‐150
4.2.8
Ex. 4.14
Spyware
Software that is secretly or surreptitiously installed into a system to gather information on individuals or organizations without their knowledge; a type of malicious code.
NIST SP 800‐12 1.3
4.2.5
Ex.4.10
Trojan horse
A computer program that appears to have a useful function, but also has a hidden and potentially malicious function that evades security mechanisms, sometimes by exploiting legitimate authorizations of a system entity that invokes the program.
NIST SP 800‐12
4.2.4
Ex.4.9
Virus
A computer program that can copy itself and infect a computer without permission or knowledge of the user. A virus might corrupt or delete data on a computer, use email programs to spread itself to other computers, or even erase everything on a hard disk.
NIST 800‐12
4.2.2
Ex. 4.3, 4.4, 4.5, 4.6
Worm
A computer program that can run independently, can propagate a complete working version of itself onto other hosts on a network, and may consume computer resources destructively.
NIST 800‐82
4.2.3
Ex.4.1, 4.2, 4.7, 4.8
Risk
The risk to organizational operations (including mission, functions, image, reputation), organizational assets, individuals, other organizations, and the Nation due to the potential for unauthorized access, use, disclosure, disruption, modification, or destruction of information and/or a system.
NIST 800‐12
1.3
Spam
Electronic junk mail or the abuse of electronic messaging systems to indiscriminately send unsolicited bulk messages.
NIST 800‐12
4.3
Threat
Any circumstance or event with the potential to adversely impact organizational operations (including mission, functions, image, or reputation), organizational assets, individuals, other organizations, or the Nation through a system via unauthorized access, destruction, disclosure, modification of information, and/or denial of service.
NIST 800‐12
1.2
Destruction
The process of overwriting, erasing, or physically destroying information (e.g. a cryptographic key) so that it cannot be recovered.
NIST 800‐88
Disclosure
Divulging of, or provision of access to, data.
NISTIR 8053
Unauthorized access
A person gains logical or physical access without permission to a network, system, application, data, or other resource.
NIST 800‐82
1.3
Ex. 5.2
Vulnerability
Weakness in an information system, system security procedures, internal controls, or implementation that could be exploited or triggered by a threat source.
NIST 800‐53
4.4.
Ex. 4.17, 4.18
Defense
Computer security or Cybersecurity
The ability to protect or defend the use of cyberspace from cyberattacks.
NISTIR 8170 under Cybersecurity CNSSI 4009
1.1
Computer security policy
Security policies define the objectives and constraints for the security program. Policies are created at several levels, ranging from organization or corporate policy to specific operational constraints (e.g. remote access). In general, policies provide answers to the questions “what” and “why” without dealing with “how.” Policies are normally stated in terms that are technology‐independent.
NIST 800‐82
1.1
Confidentiality
Preserving authorized restrictions on information access and disclosure, including means for protecting personal privacy and proprietary information.
NIST 800‐53
1.2
Integrity
Guarding against improper information modification or destruction, and includes ensuring information non‐repudiation and authenticity.
NIST 800‐53
1.2
Availability
Ensuring timely and reliable access to and use of information.
NIST 800‐53
1.2
Firewall
A device or program that controls the flow of network traffic between networks or hosts that employ differing security postures.
NIST SP 800‐41 Rev. 1
2
Application proxy
A firewall capability that combines lower‐layer access control with upper layer‐functionality, and includes a proxy agent that acts as an intermediary between two hosts that wish to communicate with each other.
NIST 800‐41
2.2, 2.3
Demilitarized zone (DMZ)
An interface on a routing firewall that is similar to the interfaces found on the firewall’s protected side. Traffic moving between the DMZ and other interfaces on the protected side of the firewall still goes through the firewall and can have firewall protection policies applied.
NIST 800‐41
2.1, 2.3
Network address translation (NAT)
A routing technology used by many firewalls to hide internal system addresses from an external network through use of an addressing schema.
NIST 800‐41
2.2
Packet filter
A routing device that provides access control functionality for host addresses and communication sessions.
NIST 800‐41
2.2, 2.3
Stateful inspection
Packet filtering that also tracks the state of connections and blocks packets that deviate from the expected state.
NIST 800‐41
2.2, 2.3
Virtual Private Network (VPN)
Protected information system link utilizing tunneling, security controls, and endpoint address translation giving the impression of a dedicated line.
NIST 800‐53
2.1, 2.3
Intrusion detection system (IDS)
A security service that monitors and analyzes network or system events for the purpose of finding, and providing real‐time or near real‐time warning of, attempts to access system resources in an unauthorized manner.
NIST 800‐82
3
Intrusion protection system (IPS)
A system that can detect an intrusive activity and can also attempt to stop the activity, ideally before it reaches its targets.
NIST 800‐82
Rule set
A collection of rules or signatures that network traffic or system activity is compared against to determine an action to take, such as forwarding or rejecting a packet, creating an alert, or allowing a system event.
NIST 800‐115
Ex. 3.1.
False negative or Missing attack
Incorrectly classifying malicious activity as benign.
NIST 800‐83
3.5
False positive or False alarm
Incorrectly classifying benign activity as malicious.
3.5
User authentication
Verifying the identity of a user, process, or device, often as a prerequisite to allowing access to resources in an information system.
NIST 800‐53
5.3
Techniques and Technologies
Internet
The single interconnected worldwide system of commercial, government, educational, and other computer networks that share the set of protocols specified by the Internet Architecture Board (IAB) and the name and address spaces managed by the Internet Corporation for Assigned Names and Numbers (ICANN).
NIST SP 800‐82 Rev. 2 RFC 4949
Algorithm
Formulae given to a computer in order for it to complete a task (i.e. a set of rules for a computer).
Conventional Techniques
String pattern search
Aho–Corasick
Dictionary‐matching algorithm that locates elements of a finite set of strings (the “dictionary”) within an input text and attempts to match all strings simultaneously.
Ex. 4.20.
Boyer and Moore
An efficient string‐searching algorithm that is the standard benchmark for practical string‐search literature.
Alg 3.4
Knuth, Pratt, and Morris
Algorithm that checks the characters from left to right; when the pattern contains a sub‐pattern that appears more than once, it uses that property to improve the time complexity.
Alg 3.2
Naïve (brute force)
Very general problem‐solving technique and algorithmic paradigm that consists of systematically enumerating all possible candidates for the solution and checking whether each candidate satisfies the problem's statement.
Alg 3.1
Rabin and Karp
A string‐searching algorithm that uses hashing to find patterns in strings.
Alg 3.3.
AI, ML, and Data Science
Artificial intelligence (AI)
Interdisciplinary field, usually regarded as a branch of computer science, dealing with models and systems for the performance of functions generally associated with human intelligence, such as reasoning and learning.
1.5.2.
Fuzzy logic
Form of logic that is much closer to human thinking and natural language than traditional binary logic.
1.5.6, 5.2.4
Expert systems
Intelligent computer program that uses knowledge and inference procedures to solve problems that are difficult enough to require significant human expertise for their solution.
1.5.5, 2.5, 5.2.4
Knowledge‐based
A knowledge‐based system is a computer program that reasons and uses a knowledge base to solve complex problems.
1.5.5, 2.5
Ex. 3.1
Artificial neural networks
A computing system, made up of a number of simple, highly interconnected processing elements, which processes information by its dynamic state response to external inputs.
1.5.8, 3.6.5
Autoencoders
The model that aims to reconstruct data from the input layer into the output layer with a minimal amount of distortion.
Backpropagation
Shorthand for “backward propagation of errors”: a method of training an ANN in which the system’s initial output is compared to the desired output and the system is then adjusted until the difference between the outputs becomes minimal.
1.5.8
Convolutional
Multilayer topology with a few hidden layers, where each neuron receives its input only from a subset of neurons of the previous layer.
1.5.8, 5.4.2, 6.5.2
Ex.5.8
Deep belief
Composition of Restricted Boltzmann Machines (RBM), a class of neural networks with no output layer.
1.5.8
Generative adversarial networks (GAN)
Unsupervised learning technique that is capable of generating data with selected properties similar to those of a chosen dataset.
6.5
Long Short Term Memory
Special type of recurrent topology, which has memory cells that maintain information in memory for a longer period.
5.1.8.3
Multilayer perceptron (MLP)
An ANN model in which neurons are grouped into layers and the layers are connected to each other according to certain connectivity rules.
3.6.5.3, 4.6.3
Ex 4.22
Modified time‐based multilayer perceptron (MTBMLP)
ANN topology that consists of multiple time‐based MLPs, all connected to a single‐end MLP, with time series used as inputs.
3.6.5.3, 4.6.3
Ex. 4.22
Radial basis function (RBF)
ANN that uses radial basis functions as activation functions, producing an output, which is a linear combination of radial basis functions of the inputs and neuron parameters.
3.6.5.3
Recurrent
A multilayer topology, which includes the feedback loop that connects its output to the inputs.
1.5.8, 5.1.8.3
Data science
The field that combines domain expertise, programming skills, and knowledge of mathematics and statistics to extract meaningful insights from data.
1.5.3.
Machine learning
A subfield of AI that comprises the study of algorithms which have a capability to improve themselves automatically through experience to solve problems without external instructions, by using previously trained models.
1.5.3.
Intelligent agents
Autonomous entity, which acts, directing its activity toward achieving goals, upon an environment using observation through sensors and consequent actuators.
3.6.5.4
Deep learning
Machine learning method based on characterization of data learning.
1.5.3
Reinforcement learning
Algorithms, in which an agent decides what to do to perform the given task to maximize the given function.
1.5.7
Shallow learning
Techniques that separate the process of feature extraction from learning itself.
3.6.5.1
Supervised learning
Algorithms, which develop a mathematical model from the input data and known desired outputs.
1.5.7
Alg. 1.1.
Unsupervised learning
Algorithms that take a set of data consisting only of inputs and attempt to cluster the data objects based on their similarities or dissimilarities.
1.5.7.
Alg. 1.2.
Decision tree
Tree‐structure resembling a flowchart, where every node represents a test to an attribute, each branch represents the possible outcomes of that test, and the leaves represent the class labels.
J48
Open source Java implementation of the C4.5 algorithm that builds decision trees from a set of training data using the concept of information entropy.
6.6.4
Genetic/evolutionary algorithms
Set of evolutionary algorithms that take inspiration from genetic evolution theories.
3.6.4, 3.6.5.4
Alg. 1.3
Hidden Markov models
Algorithm that builds up a set of states producing outputs with different probabilities with the goal to find out the sequence of states that results in the observed outputs.
K‐means
Clustering algorithm that uses a distance function to distribute all data pieces between k clusters defined by their centroid position in the feature space.
3.6.2
K‐nearest neighbor
Classification algorithm that uses a distance function in order to determine to which class to assign the new element by finding K closest elements in the feature space.
3.6.3, 5.3.5.4
Naive Bayes
Algorithm that consists of applying the Bayes theorem in order to find a distribution of conditional probabilities among class labels, with the assumption of independence between features.
Random forest
An ensemble learning method that builds a large group of independent decision trees, and outputs the mode of the label predictions of all the trees.
6.6.4
Sec.6.6.4
Support vector machine
Binary classification algorithm that creates a hyperplane separating the data into two classes, with the objective of maximizing the gap perpendicular to the plane, which allows better generalization.
Please note: I realize that there exist various definitions, and even different understandings, of these terms. I have chosen to follow first the definitions given in the publications of the NIST Computer Security Resource Center (see https://csrc.nist.gov/glossary and Section I.6) and then those from other sources (see Section I.7). Even those publications are ambiguous in some cases and provide different meanings. I have chosen the ones that are followed in this book. I do not intend this list to be all‐inclusive or exclusive.
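The string pattern search entries in the glossary above can be illustrated with a short Python sketch that contrasts the naïve (brute force) search with the Rabin and Karp approach. It is an illustration only, not a reproduction of the book's Alg 3.1 or Alg 3.3; the function names, the base of 256, and the modulus of 1,000,003 are assumptions chosen for the example.

```python
from typing import List


def naive_search(text: str, pattern: str) -> List[int]:
    """Brute force: try every alignment and compare the characters directly."""
    m = len(pattern)
    if m == 0:
        return []
    return [i for i in range(len(text) - m + 1) if text[i:i + m] == pattern]


def rabin_karp_search(text: str, pattern: str,
                      base: int = 256, mod: int = 1_000_003) -> List[int]:
    """Rabin-Karp: compare rolling hashes, verify characters only on a hash hit."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    # Weight of the leading character inside a window of length m.
    high = pow(base, m - 1, mod)
    p_hash = 0  # hash of the pattern
    w_hash = 0  # hash of the current text window
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        w_hash = (w_hash * base + ord(text[i])) % mod
    hits = []
    for i in range(n - m + 1):
        # Equal hashes may be a collision, so confirm with a direct comparison.
        if w_hash == p_hash and text[i:i + m] == pattern:
            hits.append(i)
        if i < n - m:
            # Slide the window: drop text[i], append text[i + m].
            w_hash = ((w_hash - ord(text[i]) * high) * base
                      + ord(text[i + m])) % mod
    return hits


# Both functions report the same match positions.
print(naive_search("abracadabra", "abra"))       # [0, 7]
print(rabin_karp_search("abracadabra", "abra"))  # [0, 7]
```

The direct comparison after a hash hit is what keeps hash collisions from producing spurious matches; without it, a collision would be reported as a match that is not actually there.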
NIST SP 800‐12 An Introduction to Information Security, June 2017, available free of charge from: https://doi.org/10.6028/NIST.SP.800‐12r1
NIST SP 800‐30 Guide for Conducting Risk Assessments NIST, Sep. 2012, available at https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800‐30r1.pdf
NIST SP 800‐39 Managing Information Security Risk, March 2011, available at https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800‐39.pdf
NIST SP 800‐41 Rev. 1 Guidelines on Firewalls and Firewall Policy NIST, September 2009, available at https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800‐41r1.pdf
NIST SP 800‐53 Rev. 5 CNSSI 4009 Security and Privacy Controls for Information Systems and Organizations, September 2020, available at doi.org/10.6028/NIST.SP.800‐53r5
NIST 800‐63 Digital Identity Guidelines, June 2017, available at https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800‐63‐3.pdf
NIST SP 800‐82 Rev. 2 RFC 4949, Guide to Industrial Control Systems (ICS) Security, May 2015, available from: http://dx.doi.org/10.6028/NIST.SP.800‐82r2
NIST 800‐83 Revision 1 Guide to Malware Incident Prevention and Handling for Desktops and Laptops, July 2013, available at https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800‐83r1.pdf
NIST 800‐88, Revision 1: Guidelines for Media Sanitization, 5 February 2015, available at https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800‐88r1.pdf
NIST Special Publication 800‐101 Guidelines on Mobile Device Forensics, May 2014, available at http://dx.doi.org/10.6028/NIST.SP.800‐101r1
NIST Special Publication 800‐115 Technical Guide to Information Security Testing and Assessment, Sep. 2008, available at https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800‐115.pdf
NIST Special Publication 800‐150 Guide to Cyber Threat Information Sharing, October 2016, available at https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800‐150.pdf
NISTIR 8053 De‐Identification of Personal Information, October 2015, available at https://nvlpubs.nist.gov/nistpubs/ir/2015/NIST.IR.8053.pdf
NISTIR 8170 Approaches for Federal Agencies to Use the Cybersecurity Framework NIST, March 2020, available at https://doi.org/10.6028/NIST.IR.8170
NISTIR 8269 (DRAFT) A taxonomy and terminology of adversarial machine learning, May 2019, available at https://nvlpubs.nist.gov/nistpubs/ir/2019/NIST.IR.8269‐draft.pdf
National Institute of Standards and Technology (NIST) provides a keyword searchable glossary of more than 6700 security‐related terms with references to a particular NIST publication. This Glossary consists of terms and definitions extracted verbatim from NIST's cybersecurity‐ and privacy‐related Federal Information Processing Standards (FIPS), NIST Special Publications (SPs), and NIST Internal/Interagency Reports (IRs), as well as from Committee on National Security Systems (CNSS) Instruction CNSSI‐4009 – see
https://csrc.nist.gov/glossary/
The National Initiative for Cybersecurity Careers and Studies of the Department of Homeland Security Portal provides cybersecurity lexicon to serve the cybersecurity communities of practice and interest for both the public and private sectors. It complements other lexicons such as the NISTIR 7298 Glossary of Key Information Security Terms. Objectives for lexicon are to enable clearer communication and common understanding of cybersecurity terms, through use of plain English and annotations on the definitions. The lexicon will evolve through ongoing feedback from end users and stakeholders – see
https://niccs.cisa.gov/about‐niccs/cybersecurity‐glossary#
SANS Institute glossary of terms – see
https://www.sans.org/security‐resources/glossary‐of‐terms/
Canadian Centre for Cyber Security’s glossary – see
https://cyber.gc.ca/en/glossary
Council of Europe, Artificial Intelligence Glossary – see
https://www.coe.int/en/web/artificial‐intelligence/glossary
Wikipedia, Glossary of Artificial Intelligence – see
https://en.wikipedia.org/wiki/Glossary_of_artificial_intelligence
Wikipedia https://en.wikipedia.org/wiki/Comparison_of_antivirus_software
Anti‐malware Reviews at http://www.antimalwarereviews.com
Common Vulnerabilities and Exposures (CVE)® is a list of records – each containing an identification number, a description, and at least one public reference – for publicly known cybersecurity vulnerabilities – https://cve.mitre.org and CVE Details (https://www.cvedetails.com)
Comparison of computer viruses, Wikipedia, accessed at https://en.wikipedia.org/wiki/Comparison_of_computer_viruses – contains a unified list (currently a few dozen virus families) of computer viruses with their origin, isolation dates, and short descriptions
Computer worms, Wikipedia, accessed at https://en.wikipedia.org/wiki/List_of_computer_worms – contains a unified list (currently a few dozen) of computer worms with their origin, isolation dates, and short descriptions
Clam antivirus signature database, www.clamav.net.
Kaggle is the world's largest data science community with powerful tools and resources to help you achieve your data science goals. Their website provides a transparent repository for public datasets – https://www.kaggle.com/datasets
Spam datasets: LingSpam (csmining.org/index.php/ling‐spam‐datasets.html) and SpamAssassin (http://spamassassin.org/publiccorpus).
Computer and network security, also called cybersecurity, is one of the most significant subjects to consider when dealing with computers, networking, and data. As data and digital technology become part of everyone's life, their security becomes more important too. More than two billion people are currently estimated to use the Internet on a regular basis, and the amount of sensitive and private data that government and nongovernment organizations collect, store, and need to protect grows every day. At the same time, computer systems and communication networks have always been vulnerable to a myriad of threats that can inflict different types of damage and result in significant losses. The damage can range from a data entry error that violates data integrity to a planted virus that destroys an entire database, and its source can be anything from an outside hacker to an untrustworthy inside employee or a simple human error.
A new professional community of computer security specialists, who are responsible for protecting systems against adversary attacks and preventing damage, has formed. Cybersecurity significantly changes the protection landscape and the range of defense mechanisms. Nowadays, attackers have a wide selection of devices that they can target and infect through local and global networks, from an entire computer infrastructure to mobile phones and even computerized automobiles. With each passing year, attacks seem to escalate, and the threats evolve as attackers invent new ways to steal, harm, and destroy. Criminals aim at stealing private as well as financial information, and even government and political entities are not immune. Hence, there is a rising need to safeguard sensitive information and computer resources from these complex and malicious threats. Hacking has been around for decades, but present‐day crimes are often motivated not only financially but also politically or personally. Hacker activities such as identity theft and cyber terrorism create a clear danger to private citizens and threaten the whole fabric of society. The possible sponsorship of hacking by some government and nongovernment organizations around the world makes their ulterior motives and goals even more frightening.
Computer security is a constantly evolving field, especially if one takes into account current technological and societal developments. Two major interrelated trends are observed in modern technological development: computerization, or digitization, and interconnection through the Internet (Figure 1.1). Over the last couple of decades, the number of devices connected to the Internet has grown enormously. New computerized products released nowadays have an Internet connection capability, and the rate at which they use it is high. These devices generate and collect an astronomical amount of data, which needs to be stored, processed, communicated, and accessed (Figure 1.2). Over time, however, more and more complex exploits are built.
Figure 1.1 Security threats get close to you through networking.
Computer security has always been an essential aspect of technology. Security has grown stronger compared with a few decades ago, but so have the threats. New kinds of threats and attackers have kept emerging and testing the resistance of computer and information systems (Figure 1.3). Now, with the large number of electronic gadgets such as mobile phones and data storage devices, and with the broad adoption of computer technology in all kinds of sectors, the need for sound security control mechanisms and tools is growing faster than ever before. Initially, attackers targeted large‐scale organizations in the financial sector, but that has changed. Nowadays, they are coming after smaller organizations too. Attackers look for any assets they can get access to in order to steal or modify them.
Big data and pervasive computing are the new areas that cyberattackers set out to exploit. Mobile phones, tablets, and laptops are the most vulnerable targets. Extensive research is carried out in these fields to make them secure. Not long ago, the picture looked different. There was a limited number of devices, and the Internet was not so widespread. Cloud computing was only a concept. Not all data were stored in the cloud. Viruses were not smart enough to exploit the available vulnerabilities. Nor did hackers have tools so powerful that people with limited or no knowledge of computing could break into other owners' systems. Figure 1.3 demonstrates the growth in attack diversity and sophistication and, at the same time, the decrease in the knowledge required to conduct cyberattacks, owing to the availability of automated tools.
Figure 1.2 New technologies and applications, such as self‐driving cars and bikes, dramatically increase the data security and privacy protection requirements.
Source: Courtesy of @ L. Reznik and I. Khokhlov.
The major reasons that are commonly given to explain the growing cost of cybercrime are:
quick adoption of new technologies by cyber criminals,
the increased number of new users online, who nowadays mostly come from low‐income communities and countries with weak cybersecurity education and implemented protection mechanisms,
an increased ease of committing cybercrime, with the growth of cyber‐crime‐as‐a‐service development and criminals attempting to monetize their breach success,
an expanding number of cybercrime centers around the world that might get supported by their government agencies,
a growing financial sophistication among top‐tier cyber criminals and availability of cryptocurrencies that, among other things, make crime monetization easier.
The recent Report to the President (2018) on Enhancing the Resilience of the Internet and Communications Ecosystem Against Botnets and Other Automated, Distributed Threats, released by the US Department of Commerce and Department of Homeland Security on 22 May 2018, clustered the opportunities and challenges in working toward dramatically reducing threats from automated, distributed attacks into six principal themes and determined possible organizational measures to address them.
Automated, distributed attacks present a global problem.
The majority of the compromised devices in recent noteworthy botnets have been geographically located outside the United States. To increase the resilience of the Internet and communications ecosystem against these threats, we must continue to work closely with international partners.
Effective tools exist, but are not widely used.
While there remains room for improvement, the tools, processes, and practices required to significantly enhance the resilience of the Internet and communications ecosystem are widely available, and are routinely applied in selected market sectors. However, they do not form the common practices for product development and deployment in many other sectors for a variety of reasons, including (but not limited to) lack of awareness, cost avoidance, insufficient technical expertise, and lack of market incentives.
Figure 1.3 Attack sophistication vs. Intruder knowledge.
Source: Courtesy of @ L. Reznik and I. Khokhlov. Modified from https://www.eleceng.adelaide.edu.au/students/wiki/projects/images/c/cd/Fig2.png.
Products should be secured during all stages of the life cycle.
