Hands-On Graph Neural Networks Using Python E-Book

Maxime Labonne

Publisher: Packt Publishing
Category: Science and new technologies
Language: English

Description

Graph neural networks are a highly effective tool for analyzing data that can be represented as a graph, such as networks, chemical compounds, or transportation networks. The past few years have seen an explosion in the use of graph neural networks, with their application ranging from natural language processing and computer vision to recommendation systems and drug discovery.
Hands-On Graph Neural Networks Using Python begins with the fundamentals of graph theory and shows you how to create graph datasets from tabular data. As you advance, you’ll explore major graph neural network architectures and learn essential concepts such as graph convolution, self-attention, link prediction, and heterogeneous graphs. Finally, the book proposes applications to solve real-life problems, enabling you to build a professional portfolio. The code is readily available online and can be easily adapted to other datasets and apps.
By the end of this book, you’ll have learned to create graph datasets, implement graph neural networks using Python and PyTorch Geometric, and apply them to solve real-world problems, along with building and training graph neural network models for node and graph classification, link prediction, and much more.

Details

Format: EPUB

Pages: 366

Publication year: 2023

Sample

Hands-On Graph Neural Networks Using Python

Practical techniques and architectures for building powerful graph and deep learning apps with PyTorch

Maxime Labonne

BIRMINGHAM—MUMBAI

Hands-On Graph Neural Networks Using Python

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Group Product Manager: Gebin George

Publishing Product Manager: Dinesh Chaudhary

Senior Editor: David Sugarman

Technical Editor: Devanshi Ayare

Copy Editor: Safis Editing

Project Coordinator: Farheen Fathima

Proofreader: Safis Editing

Indexer: Tejal Daruwale Soni

Production Designer: Joshua Misquitta

First published: April 2023

Production reference: 1240323

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-80461-752-6

www.packtpub.com

Contributors

About the author

Maxime Labonne is a senior applied researcher at J.P. Morgan with a Ph.D. in machine learning and cyber security from the Polytechnic Institute of Paris. During his Ph.D., Maxime worked on developing machine learning algorithms for anomaly detection in computer networks. He then joined the AI Connectivity Lab at Airbus, where he applied his expertise in machine learning to improve the security and performance of computer networks. He then joined J.P. Morgan, where he now develops techniques for solving a variety of challenging problems in finance and other domains. In addition to his research work, Maxime is passionate about sharing his knowledge and experience with others through Twitter (@maximelabonne) and his personal blog.

About the reviewers

Dr. Mürsel Taşgın is a computer scientist with a Ph.D. He graduated from the Computer Engineering Department of Middle East Technical University in 2002. He completed his master of science and Ph.D. in the Computer Engineering Department of Bogazici University. During his Ph.D., he worked in the field of complex systems, graphs, and ML. He also worked in industry in technical, research, and managerial roles (at Mostly.AI, KKB, Turkcell, and Akbank). Dr. Mürsel Taşgın’s current focus is mainly on generative AI, graph machine learning, and financial applications of machine learning. He also teaches artificial intelligence (AI)/ML courses at universities.

I would like to thank my dear wife Zehra and precious son Kerem for their support and understanding during my long working hours.

Amir Shirian is a data scientist at Nokia, where he applies his expertise in multimodal signal processing and ML to solve complex problems. He received his Ph.D. in computer science from the University of Warwick, England, after completing his bachelor of science and master of science degrees in electrical engineering at the University of Tehran, Iran. Amir’s research focuses on developing algorithms and models for emotion and behavior understanding, with a particular interest in using graph neural networks to analyze and interpret data from multiple sources. His work has been published in several high-profile academic journals and presented at international conferences. Amir enjoys hiking, playing 3tar, and exploring new technologies in his free time.

Lorenzo Giusti is a Ph.D. student in data science at La Sapienza, University of Rome, with a focus on extending graph neural networks through topological deep learning. He has extensive research experience as a visiting Ph.D. student at Cambridge, as a research scientist intern at NASA, where he supervised a team and led a project on synthesizing the Martian environment using images from spacecraft cameras, and as a research scientist intern at CERN, working on anomaly detection for particle physics accelerators. Lorenzo also has a master of science in data science from La Sapienza and a bachelor of engineering in computer engineering from Roma Tre University, where he focused on quantum technologies.

Preface

Part 1: Introduction to Graph Learning

1 Getting Started with Graph Learning

Why graphs?

Why graph learning?

Why graph neural networks?

Summary

2 Graph Theory for Graph Neural Networks

Technical requirements

Introducing graph properties

Directed graphs

Weighted graphs

Connected graphs

Types of graphs

Discovering graph concepts

Fundamental objects

Graph measures

Adjacency matrix representation

Exploring graph algorithms

Breadth-first search

Depth-first search

Summary

3 Creating Node Representations with DeepWalk

Technical requirements

Introducing Word2Vec

CBOW versus skip-gram

Creating skip-grams

The skip-gram model

DeepWalk and random walks

Implementing DeepWalk

Summary

Part 2: Fundamentals

4 Improving Embeddings with Biased Random Walks in Node2Vec

Technical requirements

Introducing Node2Vec

Defining a neighborhood

Introducing biases in random walks

Implementing Node2Vec

Building a movie RecSys

Summary

5 Including Node Features with Vanilla Neural Networks

Technical requirements

Introducing graph datasets

The Cora dataset

The Facebook Page-Page dataset

Classifying nodes with vanilla neural networks

Classifying nodes with vanilla graph neural networks

Summary

6 Introducing Graph Convolutional Networks

Technical requirements

Designing the graph convolutional layer

Comparing graph convolutional and graph linear layers

Predicting web traffic with node regression

Summary

7 Graph Attention Networks

Technical requirements

Introducing the graph attention layer

Linear transformation

Activation function

Softmax normalization

Multi-head attention

Improved graph attention layer

Implementing the graph attention layer in NumPy

Implementing a GAT in PyTorch Geometric

Summary

Part 3: Advanced Techniques

8 Scaling Up Graph Neural Networks with GraphSAGE

Technical requirements

Introducing GraphSAGE

Neighbor sampling

Aggregation

Classifying nodes on PubMed

Inductive learning on protein-protein interactions

Summary

9 Defining Expressiveness for Graph Classification

Technical requirements

Defining expressiveness

Introducing the GIN

Classifying graphs using GIN

Graph classification

Implementing the GIN

Summary

10 Predicting Links with Graph Neural Networks

Technical requirements

Predicting links with traditional methods

Heuristic techniques

Matrix factorization

Predicting links with node embeddings

Introducing Graph Autoencoders

Introducing VGAEs

Implementing a VGAE

Predicting links with SEAL

Introducing the SEAL framework

Implementing the SEAL framework

Summary

11 Generating Graphs Using Graph Neural Networks

Technical requirements

Generating graphs with traditional techniques

The Erdős–Rényi model

The small-world model

Generating graphs with graph neural networks

Graph variational autoencoders

Autoregressive models

Generative adversarial networks

Generating molecules with MolGAN

Summary

12 Learning from Heterogeneous Graphs

Technical requirements

The message passing neural network framework

Introducing heterogeneous graphs

Transforming homogeneous GNNs to heterogeneous GNNs

Implementing a hierarchical self-attention network

Summary

13 Temporal Graph Neural Networks

Technical requirements

Introducing dynamic graphs

Forecasting web traffic

Introducing EvolveGCN

Implementing EvolveGCN

Predicting cases of COVID-19

Introducing MPNN-LSTM

Implementing MPNN-LSTM

Summary

14 Explaining Graph Neural Networks

Technical requirements

Introducing explanation techniques

Explaining GNNs with GNNExplainer

Introducing GNNExplainer

Implementing GNNExplainer

Explaining GNNs with Captum

Introducing Captum and integrated gradients

Implementing integrated gradients

Summary

Part 4: Applications

15 Forecasting Traffic Using A3T-GCN

Technical requirements

Exploring the PeMS-M dataset

Processing the dataset

Implementing the A3T-GCN architecture

Summary

16 Detecting Anomalies Using Heterogeneous GNNs

Technical requirements

Exploring the CIDDS-001 dataset

Preprocessing the CIDDS-001 dataset

Implementing a heterogeneous GNN

Summary

17 Building a Recommender System Using LightGCN

Technical requirements

Exploring the Book-Crossing dataset

Preprocessing the Book-Crossing dataset

Implementing the LightGCN architecture

Summary

18 Unlocking the Potential of Graph Neural Networks for Real-World Applications

Index

Other Books You May Enjoy

Part 1: Introduction to Graph Learning

In recent years, graph representation of data has become increasingly prevalent across various domains, from social networks to molecular biology. It is crucial to have a deep understanding of Graph Neural Networks (GNNs), which are designed specifically to handle graph-structured data, to unlock the full potential of this representation.

This first part consists of two chapters and serves as a solid foundation for the rest of the book. It introduces the concepts of graph learning and GNNs and their relevance in numerous tasks and industries. It also covers the fundamental concepts of graph theory and its applications in graph learning, such as graph centrality measures. This part also highlights the unique features and performance of the GNN architecture compared to other methods.

By the end of this part, you will have a solid understanding of the importance of GNNs in solving many real-world problems. You will be acquainted with the essentials of graph learning and how it is used in various domains. Furthermore, you will have a comprehensive overview of the main concepts of graph theory that we will use in later chapters. With this solid foundation, you will be well equipped to move on to the more advanced concepts in graph learning and GNNs in the following parts of the book.

This part comprises the following chapters:

Chapter 1, Getting Started with Graph Learning

Chapter 2, Graph Theory for Graph Neural Networks

1 Getting Started with Graph Learning

Welcome to the first chapter of our journey into the world of graph neural networks (GNNs). In this chapter, we will delve into the foundations of GNNs and understand why they are crucial tools in modern data analysis and machine learning. To that end, we will answer three essential questions that will provide us with a comprehensive understanding of GNNs.

First, we will explore the significance of graphs as a representation of data, and why they are widely used in various domains such as computer science, biology, and finance. Next, we will delve into the importance of graph learning, where we will understand the different applications of graph learning and the different families of graph learning techniques. Finally, we will focus on the GNN family, highlighting its unique features, performance, and how it stands out compared to other methods.

By the end of this chapter, you will have a clear understanding of why GNNs are important and how they can be used to solve real-world problems. You will also be equipped with the knowledge and skills you need to dive deeper into more advanced topics. So, let’s get started!

In this chapter, we will cover the following main topics:

Why graphs?

Why graph learning?

Why graph neural networks?

Why graphs?

The first question we need to address is: why are we interested in graphs in the first place? Graph theory, the mathematical study of graphs, has emerged as a fundamental tool for understanding complex systems and relationships. A graph is a visual representation of a collection of nodes (also called vertices) and edges that connect these nodes, providing a structure to represent entities and their relationships (see Figure 1.1).

Figure 1.1 – Example of a graph with six nodes and five edges

By representing a complex system as a network of entities with interactions, we can analyze their relationships, allowing us to gain a deeper understanding of their underlying structures and patterns. The versatility of graphs makes them a popular choice in various domains, including the following:

Computer science, where graphs can be used to model the structure of computer programs, making it easier to understand how different components of a system interact with each other

Physics, where graphs can be used to model physical systems and their interactions, such as the relationship between particles and their properties

Biology, where graphs can be used to model biological systems, such as metabolic pathways, as a network of interconnected entities

Social sciences, where graphs can be used to study and understand complex social networks, including the relationships between individuals in a community

Finance, where graphs can be used to analyze stock market trends and relationships between different financial instruments

Engineering, where graphs can be used to model and analyze complex systems, such as transportation networks and electrical power grids

These domains naturally exhibit a relational structure. For instance, graphs are a natural representation of social networks: nodes are users, and edges represent friendships. But graphs are so versatile they can also be applied to domains where the relational structure is less natural, unlocking new insights and understanding.

For example, images can be represented as a graph, as in Figure 1.2. Each pixel is a node, and edges represent relationships between neighboring pixels. This allows for the application of graph-based algorithms to image processing and computer vision tasks.

Figure 1.2 – Left: original image; right: graph representation of this image
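As a rough sketch of the idea behind Figure 1.2, the following snippet turns a tiny image into a graph. The 3x3 pixel values, the function name image_to_graph, and the choice of a 4-neighborhood (left, right, up, down) are assumptions made for illustration; other neighborhood definitions are equally valid.

```python
# Hypothetical 3x3 grayscale image; the values are made up for illustration.
image = [
    [0, 50, 100],
    [25, 75, 125],
    [50, 100, 150],
]

def image_to_graph(image):
    """Return (nodes, edges): each pixel is a node, and edges link
    horizontally or vertically adjacent pixels (4-neighborhood)."""
    height, width = len(image), len(image[0])
    # Node id -> pixel intensity, using row-major numbering.
    nodes = {r * width + c: image[r][c]
             for r in range(height) for c in range(width)}
    edges = []
    for r in range(height):
        for c in range(width):
            node = r * width + c
            if c + 1 < width:       # neighbor to the right
                edges.append((node, node + 1))
            if r + 1 < height:      # neighbor below
                edges.append((node, node + width))
    return nodes, edges

nodes, edges = image_to_graph(image)
print(len(nodes), "nodes and", len(edges), "edges")
```

Each undirected edge is stored once, so a pixel's neighbors are found by scanning for edges that contain its node id.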

Similarly, a sentence can be transformed into a graph, where nodes are words and edges represent relationships between adjacent words. This approach is useful in natural language processing and information retrieval tasks, where the context and meaning of words are critical factors.
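A minimal sketch of this transformation is shown next. The example sentence is invented, and using one node per word position (rather than one node per distinct word) is an assumption; both conventions appear in practice.

```python
# Example sentence, made up for illustration.
sentence = "graphs can represent relationships between words"

words = sentence.split()

# One node per word position, so a repeated word would get separate nodes.
nodes = list(enumerate(words))

# Connect each word to the word that immediately follows it.
edges = [(i, i + 1) for i in range(len(words) - 1)]

for i, j in edges:
    print(words[i], "-", words[j])
```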

Unlike text and images, graphs do not have a fixed structure. However, this flexibility also makes graphs more challenging to handle. The absence of a fixed structure means they can have an arbitrary number of nodes and edges, with no specific ordering. In addition, graphs can represent dynamic data, where the connections between entities can change over time. For example, the relationships between users and products can change as they interact with each other. In this scenario, nodes and edges are updated to reflect changes in the real world, such as new users, new products, and new relationships.

In the next section, we will delve deeper into how to use graphs with machine learning to create valuable applications.

Why graph learning?

Graph learning is the application of machine learning techniques to graph data. This study area encompasses a range of tasks aimed at understanding and manipulating graph-structured data. There are many graph learning tasks, including the following:

Node classification is a task that involves predicting the category (class) of a node in a graph. For example, it can categorize online users or items based on their characteristics. In this task, the model is trained on a set of labeled nodes and their attributes, and it uses this information to predict the class of unlabeled nodes.

Link prediction is a task that involves predicting missing links between pairs of nodes in a graph. This is useful in knowledge graph completion, where the goal is to complete a graph of entities and their relationships. For example, it can be used to predict the relationships between people based on their social network connections (friend recommendation).

Graph classification is a task that involves categorizing different graphs into predefined categories. One example of this is in molecular biology, where molecular structures can be represented as graphs, and the goal is to predict their properties for drug design. In this task, the model is trained on a set of labeled graphs and their attributes, and it uses this information to categorize unseen graphs.

Graph generation is a task that involves generating new graphs based on a set of desired properties. One of the main applications is generating novel molecular structures for drug discovery. This is achieved by training a model on a set of existing molecular structures and then using it to generate new, unseen structures. The generated structures can be evaluated for their potential as drug candidates and further studied.

Graph learning has many other practical applications that can have a significant impact. One of the most well-known applications is recommender systems, where graph learning algorithms recommend relevant items to users based on their previous interactions and relationships with other items. Another important application is traffic forecasting, where graph learning can improve travel time predictions by considering the complex relationships between different routes and modes of transportation.

The versatility and potential of graph learning make it an exciting field of research and development. The study of graphs has advanced rapidly in recent years, driven by the availability of large datasets, powerful computing resources, and advancements in machine learning and artificial intelligence. As a result, we can list four prominent families of graph learning techniques [1]:

Graph signal processing, which applies traditional signal processing methods to graphs, such as the graph Fourier transform and spectral analysis. These techniques reveal the intrinsic properties of the graph, such as its connectivity and structure.

Matrix factorization, which seeks to find low-dimensional representations of large matrices. The goal of matrix factorization is to identify latent factors or patterns that explain the observed relationships in the original matrix. This approach can provide a compact and interpretable representation of the data.

Random walk, which refers to a mathematical concept used to model the movement of entities in a graph. By simulating random walks over a graph, information about the relationships between nodes can be gathered. This is why they are often used to generate training data for machine learning models.

Deep learning, which is a subfield of machine learning that focuses on neural networks with multiple layers. Deep learning methods can effectively encode and represent graph data as vectors. These vectors can then be used in various tasks with remarkable performance.
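To make the random walk family more concrete, here is a minimal sketch of an unbiased random walk that uses only the Python standard library. The example graph, the function name random_walk, and the walk length are assumptions chosen for illustration, not a method prescribed by this chapter.

```python
import random

# Hypothetical undirected graph stored as an adjacency list.
graph = {
    0: [1, 2],
    1: [0, 2, 3],
    2: [0, 1],
    3: [1, 4],
    4: [3],
}

def random_walk(graph, start, length, rng):
    """Return a walk of at most `length` steps starting at `start`."""
    walk = [start]
    for _ in range(length):
        neighbors = graph[walk[-1]]
        if not neighbors:   # isolated node: nothing to move to
            break
        # Move to a neighbor chosen uniformly at random.
        walk.append(rng.choice(neighbors))
    return walk

rng = random.Random(0)  # fixed seed so that runs are reproducible
walks = [random_walk(graph, node, 5, rng) for node in graph]
for walk in walks:
    print(walk)
```

Each walk is a sequence of node ids, so a collection of walks can be treated like sentences and used as training data for a model.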

It is important to note that these techniques are not mutually exclusive and often overlap in their applications. In practice, they are often combined to form hybrid models that leverage the strengths of each. For example, matrix factorization and deep learning techniques might be used in combination to learn low-dimensional representations of graph-structured data.

As we delve into the world of graph learning, it is crucial to understand the fundamental building block of any machine learning technique: the dataset. Traditional tabular datasets, such as spreadsheets, represent data as rows and columns with each row representing a single data point. However, in many real-world scenarios, the relationships between data points are just as meaningful as the data points themselves. This is where graph datasets come in. Graph datasets represent data points as nodes in a graph and the relationships between those data points as edges.

Let’s take the tabular dataset shown in Figure 1.3 as an example.

Figure 1.3 – Family tree as a tabular dataset versus a graph dataset

This dataset represents information about five members of a family. Each member has three features (or attributes): name, age, and gender. However, the tabular version of this dataset doesn’t show the connections between these people. On the contrary, the graph version represents them with edges, which allows us to understand the relationships in this family. In many contexts, the connections between nodes are crucial in understanding the data, which is why representing data in graph form is becoming increasingly popular.
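The contrast between the two versions can be sketched in a few lines of Python. The names, ages, genders, and relationships below are invented for illustration and are not the values shown in Figure 1.3; the helper relatives is also hypothetical.

```python
# Tabular view: one row per family member (all values are invented).
members = [
    {"name": "Alice", "age": 68, "gender": "F"},
    {"name": "Bob", "age": 70, "gender": "M"},
    {"name": "Carol", "age": 41, "gender": "F"},
    {"name": "Dan", "age": 39, "gender": "M"},
    {"name": "Eve", "age": 12, "gender": "F"},
]

# Graph view: the same rows become nodes, and edges add the
# connections that the table alone cannot express.
nodes = {m["name"]: m for m in members}
edges = [
    ("Alice", "Bob", "spouse"),
    ("Alice", "Carol", "parent"),   # label read from first to second node
    ("Bob", "Carol", "parent"),
    ("Carol", "Dan", "spouse"),
    ("Carol", "Eve", "parent"),
    ("Dan", "Eve", "parent"),
]

def relatives(name):
    """List everyone directly connected to `name`, with the edge label."""
    result = []
    for a, b, relation in edges:
        if a == name:
            result.append((b, relation))
        elif b == name:
            result.append((a, relation))
    return result

print(relatives("Carol"))
```

The table alone can answer questions about individuals, such as who is oldest, while the edge list is what makes questions about relationships answerable.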

Now that we have a basic understanding of graph machine learning and the different types of tasks it involves, we can move on to exploring one of the most important approaches for solving these tasks: graph neural networks.

Why graph neural networks?

In this book, we will focus on the deep learning family of graph learning techniques, often referred to as graph neural networks. GNNs are a new category of deep learning architecture and are specifically designed for graph-structured data. Unlike traditional deep learning algorithms, which have been primarily developed for text and images, GNNs are explicitly made to process and analyze graph datasets (see Figure 1.4).

Figure 1.4 – High-level architecture of a GNN pipeline, with a graph as input and an output that corresponds to a given task

GNNs have emerged as a powerful tool for graph learning and have shown excellent results in various tasks and industries. One of the most striking examples is how a GNN model identified a new antibiotic [2]. The model was trained on 2,500 molecules and was tested on a library of 6,000 compounds. It predicted that a molecule called halicin should be able to kill many antibiotic-resistant bacteria while having low toxicity to human cells. Based on this prediction, the researchers used halicin to treat mice infected with antibiotic-resistant bacteria. They demonstrated its effectiveness and believe the model could be used to design new drugs.

How do GNNs work? Let’s take the example of a node classification task in a social network, like the previous family tree (Figure 1.3). In a node classification task, GNNs take advantage of information from different sources to create a vector representation of each node in the graph. This representation encompasses not only the original node features (such as name, age, and gender) but also information from edge features (such as the strength of relationships between nodes) and global features (such as network-wide statistics).

This is why GNNs are often more effective than traditional machine learning techniques on graphs. Instead of being limited to the original attributes, GNNs enrich the original node features with attributes from neighboring nodes, edges, and global features, making the representation much more comprehensive and meaningful. The new node representations are then used to perform a specific task, such as node classification, regression, or link prediction.

Specifically, GNNs define a graph convolution operation that aggregates information from the neighboring nodes and edges to update the node representation. This operation is performed iteratively, allowing the model to learn more complex relationships between nodes as the number of iterations increases. For example, Figure 1.5 shows how a GNN would calculate the representation of node 5 using neighboring nodes.

Figure 1.5 – Left: input graph; right: computation graph representing how a GNN computes the representation of node 5 based on its neighbors
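The aggregation idea can be sketched without any deep learning library. In the snippet below, the graph, the feature values, and the function name aggregate are assumptions, and the update rule (a plain mean over a node and its neighbors) is a deliberate simplification: actual GNN layers also apply learned weights and a nonlinear activation.

```python
# Hypothetical undirected graph (adjacency list) with 2-dimensional features.
# This is not the graph shown in Figure 1.5.
neighbors = {
    1: [2, 3],
    2: [1, 5],
    3: [1, 5],
    4: [5],
    5: [2, 3, 4],
}
features = {
    1: [1.0, 0.0],
    2: [0.0, 1.0],
    3: [1.0, 1.0],
    4: [0.0, 0.0],
    5: [2.0, 2.0],
}

def aggregate(features, neighbors):
    """One round: each node's new vector is the mean of its own vector
    and its neighbors' vectors (no learned weights, no nonlinearity)."""
    updated = {}
    for node, nbrs in neighbors.items():
        group = [features[node]] + [features[n] for n in nbrs]
        dim = len(features[node])
        updated[node] = [sum(vec[d] for vec in group) / len(group)
                         for d in range(dim)]
    return updated

h1 = aggregate(features, neighbors)  # uses information from 1-hop neighbors
h2 = aggregate(h1, neighbors)        # now also reflects 2-hop neighbors
print(h1[5], h2[5])
```

Applying the function twice is what lets a node's representation depend on nodes two hops away, which mirrors the iterative behavior described above.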

It is worth noting that Figure 1.5 provides a simplified illustration of a computation graph. In reality, there are various kinds of GNNs and GNN layers, each of which has a unique structure and way of aggregating information from neighboring nodes. These different variants of GNNs also have their own advantages and limitations and are well-suited for specific types of graph data and tasks. When selecting the appropriate GNN architecture for a particular problem, it is crucial to understand the characteristics of the graph data and the desired outcome.

More generally, GNNs, like other deep learning techniques, are most effective when applied to specific problems. These problems are characterized by high complexity, meaning that learning good representations is critical to solving the task at hand. For example, a highly complex task could be recommending the right products among billions of options to millions of customers. On the other hand, some problems, such as finding the youngest member of our family tree, can be solved without any machine learning technique.

Furthermore, GNNs require a substantial amount of data to perform effectively. Traditional machine learning techniques might be a better fit in cases where the dataset is small, as they are less reliant on large amounts of data. However, these techniques do not scale as well as GNNs. GNNs can process bigger datasets thanks to parallel and distributed training. They can also exploit the additional information more efficiently, which produces better results.

Summary

In this chapter, we answered three main questions: why graphs, why graph learning, and why graph neural networks? First, we explored the versatility of graphs in representing various data types, such as social networks and transportation networks, but also text and images. We discussed the different applications of graph learning, including node classification and graph classification, and highlighted the four main families of graph learning techniques. Finally, we emphasized the significance of GNNs and their superiority over other techniques, especially regarding large, complex datasets. By answering these three main questions, we aimed to provide a comprehensive overview of the importance of GNNs and why they are becoming vital tools in machine learning.

In Chapter 2, Graph Theory for Graph Neural Networks, we will dive deeper into the basics of graph theory, which provides the foundation for understanding GNNs. This chapter will cover the fundamental concepts of graph theory, including concepts such as adjacency matrices and degrees. Additionally, we will delve into the different types of graphs and their applications, such as directed and undirected graphs, and weighted and unweighted graphs.

2 Graph Theory for Graph Neural Networks

Graph theory is a fundamental branch of mathematics that deals with the study of graphs and networks. A graph is a visual representation of complex data structures that helps us understand the relationships between different entities. Graph theory provides us with tools to model and analyze a vast array of real-world problems, such as transportation systems, social networks, and internet connectivity.

In this chapter, we will delve into the essentials of graph theory, covering three main topics: graph properties, graph concepts, and graph algorithms. We will begin by defining graphs and their components. We will then introduce the different types of graphs and explain their properties and applications. Next, we will cover fundamental graph concepts, objects, and measures, including the adjacency matrix. Finally, we will dive into graph algorithms, focusing on the two fundamental algorithms, breadth-first search (BFS) and depth-first search (DFS).

By the end of this chapter, you will have a solid foundation in graph theory, allowing you to tackle more advanced topics and design graph neural networks.

In this chapter, we will cover the following main topics:

Introducing graph properties

Discovering graph concepts

Exploring graph algorithms

Technical requirements

All the code examples from this chapter can be found on GitHub at https://github.com/PacktPublishing/Hands-On-Graph-Neural-Networks-Using-Python/tree/main/Chapter02.

The installation steps required to run the code on your local machine can be found in the Preface of this book.

Introducing graph properties

In graph theory, a graph is a mathematical structure consisting of a set of objects, called vertices or nodes, and a set of connections, called edges, which link pairs of vertices. The notation G = (V, E) is used to represent a graph, where G is the graph, V is the set of vertices, and E is the set of edges.
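As a rough illustration, a graph defined by a set of vertices and a set of edges maps directly onto two Python sets. The variable names V and E, the vertex labels, and the degree computation are assumptions made for this sketch; the edges are treated as undirected.

```python
# A small example graph; the vertex labels are arbitrary.
V = {"A", "B", "C", "D"}
E = {("A", "B"), ("A", "C"), ("B", "D")}

# Every edge must link two vertices that belong to V.
assert all(u in V and v in V for u, v in E)

# Degree of a vertex in an undirected graph: the number of edges touching it.
degree = {v: sum(v in edge for edge in E) for v in V}
print(sorted(degree.items()))
```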
