
Hands-On Network Programming with C E-Book

Lewis Van Winkle


39,40 €


Publisher: Packt Publishing
Category: Science and New Technologies
Language: English

Description

A comprehensive guide to programming with network sockets, implementing internet protocols, designing IoT devices, and much more with C

Key Features

Apply your C and C++ programming skills to build powerful network applications
Get to grips with a variety of network protocols that allow you to load web pages, send emails, and do much more
Write portable network code for Windows, Linux, and macOS

Book Description

Network programming enables processes to communicate with each other over a computer network, but it is a complex task that requires programming with multiple libraries and protocols. With its support for third-party libraries and structured documentation, C is an ideal language to write network programs.

Complete with step-by-step explanations of essential concepts and practical examples, this C network programming book begins with the fundamentals of Internet Protocol, TCP, and UDP. You'll explore client-server and peer-to-peer models for information sharing and connectivity with remote computers. The book will also cover HTTP and HTTPS for communicating between your browser and website, and delve into hostname resolution with DNS, which is crucial to the functioning of the modern web. As you advance, you'll gain insights into asynchronous socket programming and streams, and explore debugging and error handling. Finally, you'll study network monitoring and implement security best practices.

By the end of this book, you'll have experience of working with client-server applications and be able to implement new network programs in C.

The code in this book is compatible with the older C99 version as well as the latest C18 and C++17 standards. You'll work with robust, reliable, and secure code that is portable across operating systems, including Winsock sockets for Windows and POSIX sockets for Linux and macOS.

What you will learn

Uncover cross-platform socket programming APIs
Implement techniques for supporting IPv4 and IPv6
Understand how TCP and UDP connections work over IP
Discover how hostname resolution and DNS work
Interface with web APIs using HTTP and HTTPS
Explore Simple Mail Transfer Protocol (SMTP) for electronic mail transmission
Apply network programming to the Internet of Things (IoT)

Who this book is for

If you're a developer or a system administrator who wants to get started with network programming, this book is for you. Basic knowledge of C programming is assumed.

Details

You can read this e-book in the Legimi apps or in any app that supports the following format:

EPUB

Pages: 518

Year of publication: 2019


Sample

Hands-On Network Programming with C

Learn socket programming in C and write secure and optimized network code

Lewis Van Winkle

BIRMINGHAM - MUMBAI

Hands-On Network Programming with C

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Commissioning Editor: Richa Tripathi

Acquisition Editor: Shriram Shekhar

Content Development Editor: Digvijay Bagul

Technical Editor: Abin Sebastian

Copy Editor: Safis Editing

Project Coordinator: Prajakta Naik

Proofreader: Safis Editing

Indexer: Tejal Daruwale Soni

Graphics Coordinator: Tom Scaria

Production Coordinator: Aparna Bhagat

First published: May 2019

Production reference: 1100519

Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.

ISBN 978-1-78934-986-3

www.packtpub.com

For Doogie

– Lewis Van Winkle

mapt.io

Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry leading tools to help you plan your personal development and advance your career. For more information, please visit our website.

Why subscribe?

Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals

Improve your learning with Skill Plans built especially for you

Get a free eBook or video every month

Mapt is fully searchable

Copy and paste, print, and bookmark content

Packt.com

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.

Contributors

About the author

Lewis Van Winkle is a software programming consultant, entrepreneur, and founder of a successful IoT company. He has over 20 years of programming experience after publishing his first successful software product at the age of 12. He has over 15 years of programming experience with the C programming language on a variety of operating systems and platforms. He is active in the open source community and has published several popular open source programs and libraries—many of them in C. Today, Lewis spends much of his time consulting, where he loves taking on difficult projects that other programmers have given up on. He specializes in network systems, financial systems, machine learning, and interoperation between different programming languages.

I would like to thank the publisher, Packt. This book wouldn't exist without their encouragement and backing. I would also like to extend a special thank you to my reviewer, Daniele Lacamera, for the careful work he carried out. This book improved significantly as a result of his valuable feedback. I also want to acknowledge the patience and support that my friends and family have shown over the last year while I've been away writing.

About the reviewer

Daniele Lacamera is a software technologist and researcher with vast experience in software design and development on embedded systems for different industries. He is currently working as a freelance software developer and trainer. He is a worldwide expert in TCP/IP and transport protocol design and optimization, with more than 20 academic publications on the topic. He supports free software by contributing to several projects, including the Linux kernel, and is involved in a number of communities and organizations that promote the use of free and open source software in the IoT.

Packt is searching for authors like you

If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.

Title Page

Hands-On Network Programming with C

Dedication

About Packt

Why subscribe?

Packt.com

Contributors

About the author

About the reviewer

Packt is searching for authors like you

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Get in touch

Reviews

Section 1 - Getting Started with Network Programming

Introducing Networks and Protocols

Technical requirements

The internet and C

OSI layer model

TCP/IP layer model

Data encapsulation

Internet Protocol

What is an address?

Domain names

Internet routing

Local networks and address translation

Subnetting and CIDR

Multicast, broadcast, and anycast

Port numbers

Clients and servers

Putting it together

What's your address?

Listing network adapters from C

Listing network adapters on Windows

Listing network adapters on Linux and macOS

Summary

Questions

Getting to Grips with Socket APIs

Technical requirements

What are sockets?

Socket setup

Two types of sockets

Socket functions

Anatomy of a socket program

TCP program flow

UDP program flow

Berkeley sockets versus Winsock sockets

Header files

Socket data type

Invalid sockets

Closing sockets

Error handling

Our first program

A motivating example

Making it networked

Working with IPv6

Supporting both IPv4 and IPv6

Networking with inetd

Summary

Questions

An In-Depth Overview of TCP Connections

Technical requirements

Multiplexing TCP connections

Polling non-blocking sockets

Forking and multithreading

The select() function

Synchronous multiplexing with select()

select() timeout

Iterating through an fd_set

select() on non-sockets

A TCP client

TCP client code

A TCP server

TCP server code

Building a chat room

Blocking on send()

TCP is a stream protocol

Summary

Questions

Establishing UDP Connections

Technical requirements

How UDP sockets differ

UDP client methods

UDP server methods

A first UDP client/server

A simple UDP server

A simple UDP client

A UDP server

Summary

Questions

Hostname Resolution and DNS

Technical requirements

How hostname resolution works

DNS record types

DNS security

Name/address translation functions

Using getaddrinfo()

Using getnameinfo()

Alternative functions

IP lookup example program

The DNS protocol

DNS message format

DNS message header format

Question format

Answer format

Endianness

A simple DNS query

A DNS query program

Printing a DNS message name

Printing a DNS message

Sending the query

Summary

Questions

Further reading

Section 2 - An Overview of Application Layer Protocols

Building a Simple Web Client

Technical requirements

The HTTP protocol

HTTP request types

HTTP request format

HTTP response format

HTTP response codes

Response body length

What's in a URL

Parsing a URL

Implementing a web client

HTTP POST requests

Encoding form data

File uploads

Summary

Questions

Further reading

Building a Simple Web Server

Technical requirements

The HTTP server

The server architecture

Content types

Returning Content-Type from a filename

Creating the server socket

Multiple connections buffering

get_client()

drop_client()

get_client_address()

wait_on_clients()

send_400()

send_404()

serve_resource()

The main loop

Security and robustness

Open source servers

Summary

Questions

Further reading

Making Your Program Send Email

Technical requirements

Email servers

SMTP security

Finding an email server

SMTP dialog

The format of an email

A simple SMTP client program

Enhanced emails

Email file attachments

Spam-blocking pitfalls

Summary

Questions

Further reading

Section 3 - Understanding Encrypted Protocols and OpenSSL

Loading Secure Web Pages with HTTPS and OpenSSL

Technical requirements

HTTPS overview

Encryption basics

Symmetric ciphers

Asymmetric ciphers

How TLS uses ciphers

The TLS protocol

Certificates

Server name identification

OpenSSL

Encrypted sockets with OpenSSL

Certificates

A simple HTTPS client

Other examples

Summary

Questions

Further reading

Implementing a Secure Web Server

Technical requirements

HTTPS and OpenSSL summary

Certificates

Self-signed certificates with OpenSSL

HTTPS server with OpenSSL

Time server example

A full HTTPS server

HTTPS server challenges

OpenSSL alternatives

Alternatives to TLS

Summary

Questions

Further reading

Establishing SSH Connections with libssh

Technical requirements

The SSH protocol

libssh

Testing out libssh

Establishing a connection

SSH authentication

Server authentication

Client authentication

Executing a remote command

Downloading a file

Summary

Questions

Further reading

Section 4 - Odds and Ends

Network Monitoring and Security

Technical requirements

The purpose of network monitoring

Testing reachability

Checking a route

How traceroute works

Raw sockets

Checking local connections

Snooping on connections

Deep packet inspection

Capturing all network traffic

Network security

Application security and safety

Network-testing etiquette

Summary

Questions

Further reading

Socket Programming Tips and Pitfalls

Technical requirements

Error handling

Obtaining error descriptions

TCP socket tips

Timeout on connect()

TCP flow control and avoiding deadlock

Congestion control

The Nagle algorithm

Delayed acknowledgment

Connection tear-down

The shutdown() function

Preventing address-in-use errors

Sending to a disconnected peer

Socket's local address

Multiplexing with a large number of sockets

Summary

Questions

Web Programming for the Internet of Things

Technical requirements

What is the IoT?

Connectivity options

Wi-Fi

Ethernet

Cellular

Bluetooth

IEEE 802.15.4 WPANs

Hardware choices

Single-board computers

Microcontrollers

FPGAs

External transceivers and modems

IoT protocols

Firmware updates

Ethics of IoT

Privacy and data collection

End-of-life planning

Security

Summary

Questions

Answers to Questions

Chapter 1, Introducing Networks and Protocols

Chapter 2, Getting to Grips with Socket APIs

Chapter 3, An In-Depth Overview of TCP Connections

Chapter 4, Establishing UDP Connections

Chapter 5, Hostname Resolution and DNS

Chapter 6, Building a Simple Web Client

Chapter 7, Building a Simple Web Server

Chapter 8, Making Your Program Send Email

Chapter 9, Loading Secure Web Pages with HTTPS and OpenSSL

Chapter 10, Implementing a Secure Web Server

Chapter 11, Establishing SSH Connections with libssh

Chapter 12, Network Monitoring and Security

Chapter 13, Socket Programming Tips and Pitfalls

Chapter 14, Web Programming for the Internet of Things

Setting Up Your C Compiler on Windows

Installing MinGW GCC

Installing Git

Installing OpenSSL

Installing libssh

Alternatives

Setting Up Your C Compiler on Linux

Installing GCC

Installing Git

Installing OpenSSL

Installing libssh

Setting Up Your C Compiler on macOS

Installing Homebrew and the C compiler

Installing OpenSSL

Installing libssh

Example Programs

Code license

Code included with this book

Chapter 1 – Introducing Networks and Protocols

Chapter 2 – Getting to Grips with Socket APIs

Chapter 3 – An In-Depth Overview of TCP Connections

Chapter 4 – Establishing UDP Connections

Chapter 5 – Hostname Resolution and DNS

Chapter 6 – Building a Simple Web Client

Chapter 7 – Building a Simple Web Server

Chapter 8 – Making Your Program Send Email

Chapter 9 – Loading Secure Web Pages with HTTPS and OpenSSL

Chapter 10 – Implementing a Secure Web Server

Chapter 11 – Establishing SSH Connections with libssh

Chapter 12 – Network Monitoring and Security

Chapter 13 – Socket Programming Tips and Pitfalls

Chapter 14 – Web Programming for the Internet of Things

Other Books You May Enjoy

Leave a review - let other readers know what you think

Preface

Packt first contacted me about writing this book nearly a year ago. It's been a long journey, harder than I anticipated at times, and I've learned a lot. The book you hold now is the culmination of many long days, and I'm proud to finally present it.

I think C is a beautiful programming language. No other language in everyday use gets you as close to the machine as C does. I've used C to program 8-bit microcontrollers with only 16 bytes of RAM, just the same as I've used it to program modern desktops with multi-core, multi-GHz processors. It's truly remarkable that C works efficiently in both contexts.

Network programming is a fun topic, but it's also a very deep one; a lot is going on at many levels. Some programming languages hide these details behind abstractions. In the Python programming language, for example, you can download an entire web page using only one line of code. This isn't the case in C! In C, if you want to download a web page, you have to know how everything works. You need to know sockets, you need to know Transmission Control Protocol (TCP), and you need to know HTTP. In C network programming, nothing is hidden.

C is a great language to learn network programming in. This is not only because we get to see all the details, but also because the popular operating systems all use kernels written in C. No other language gives you the same first-class access as C does. In C, everything is under your control – you can lay out your data structures exactly how you want, manage memory precisely as you please, and even shoot yourself in the foot just the way you want.

When I first began writing this book, I surveyed other resources related to learning network programming with C. I found much misinformation, not only on the web, but even in print. There is a lot of C networking code that is done wrong. Internet tutorials about C sockets often use deprecated functions and ignore memory safety completely. When it comes to network programming, you can't take the "it works, so it's good enough" programming-by-coincidence approach. You have to use reasoning.

In this book, I take care to approach network programming in a modern and safe way. The example programs are carefully designed to work with both IPv4 and IPv6, and they are all written in a portable, operating system-independent way, whenever possible. Wherever there is an opportunity for memory errors, I try to take notice and point out these concerns. Security is too often left as an afterthought. I believe security is important, and it should be planned in the system from the beginning. Therefore, in addition to teaching network basics, this book spends a lot of time working with secure protocols, such as TLS.

I hope you enjoy reading this book as much as I enjoyed writing it.

Who this book is for

This book is for the C or C++ programmer who wants to add networking features to their software. It is also designed for the student or professional who simply wants to learn about network programming and common network protocols.

It is assumed that the reader already has some familiarity with the C programming language. This includes a basic proficiency with pointers, basic data structures, and manual memory management.

What this book covers

Chapter 1, Introducing Networks and Protocols, introduces the important concepts related to networking. This chapter includes example programs to determine your IP address programmatically.

Chapter 2, Getting to Grips with Socket APIs, introduces socket programming APIs and has you build your first networked program—a tiny web server.

Chapter 3, An In-Depth Overview of TCP Connections, focuses on programming TCP sockets. In this chapter, example programs are developed for both the client and server sides.

Chapter 4, Establishing UDP Connections, covers programming with User Datagram Protocol (UDP) sockets.

Chapter 5, Hostname Resolution and DNS, explains how hostnames are translated into IP addresses. In this chapter, we build an example program to perform manual DNS query lookups using UDP.

Chapter 6, Building a Simple Web Client, introduces HTTP—the protocol that powers websites. We dive right in and build an HTTP client in C.

Chapter 7, Building a Simple Web Server, describes how to construct a fully functional web server in C. This program is able to serve a static website to any modern web browser.

Chapter 8, Making Your Program Send Email, describes Simple Mail Transfer Protocol (SMTP), the protocol that powers email. In this chapter, we develop a program that can send email over the internet.

Chapter 9, Loading Secure Web Pages with HTTPS and OpenSSL, explores TLS—the protocol that secures web pages. In this chapter, we develop an HTTPS client that is capable of downloading web pages securely.

Chapter 10, Implementing a Secure Web Server, continues on the security theme and explores the construction of a secure HTTPS web server.

Chapter 11, Establishing SSH Connections with libssh, continues with the secure protocol theme. The use of Secure Shell (SSH) is covered to connect to a remote server, execute commands, and download files securely.

Chapter 12, Network Monitoring and Security, discusses the tools and techniques used to test network functionality, troubleshoot problems, and eavesdrop on insecure communication protocols.

Chapter 13, Socket Programming Tips and Pitfalls, goes into detail about TCP and addresses many important edge cases that appear in socket programming. The techniques covered are invaluable for creating robust network programs.

Chapter 14, Web Programming for the Internet of Things, gives an overview of the design and programming for Internet of Things (IoT) applications.

Appendix A, Answers to Questions, provides answers to the comprehension questions given at the end of each chapter.

Appendix B, Setting Up Your C Compiler on Windows, gives instructions for setting up a development environment on Windows that is needed for compiling all of the example programs in this book.

Appendix C, Setting Up Your C Compiler on Linux, provides the setup instructions for preparing your Linux computer to be capable of compiling all of the example programs in this book.

Appendix D, Setting Up Your C Compiler on macOS, gives step-by-step instructions for configuring your macOS system to be capable of compiling all of the example programs in this book.

Appendix E, Example Programs, lists each example program, by chapter, included in this book's code repository.

To get the most out of this book

The reader is expected to be proficient in the C programming language. This includes a familiarity with memory management, the use of pointers, and basic data structures.

A Windows, Linux, or macOS development machine is recommended; you can refer to the appendices for setup instructions.

This book takes a hands-on approach to learning and includes 44 example programs. Working through these examples as you read the book will help reinforce the concepts.

The code for this book is released under the MIT open source license. The reader is encouraged to use, modify, improve, and even publish their changes to these example programs.

Download the example code files

You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packt.com/support and register to have the files emailed directly to you.

The code bundle for the book is also publicly hosted on GitHub at https://github.com/codeplea/hands-on-network-programming-with-c. In case there's an update to the code, it will be updated on that GitHub repository. Each chapter that introduces example programs begins with the commands needed to download the book's code.

Download the color images

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: http://www.packtpub.com/sites/default/files/downloads/9781789349863_ColorImages.pdf.

Conventions used

There are a number of text conventions used throughout this book.

CodeInText: Indicates code words in text, variable names, function names, directory names, filenames, file extensions, pathnames, URLs, and user input. Here is an example: "Use the select() function to wait for network data."

A block of code is set as follows:

/* example program */
#include <stdio.h>
int main() {
    printf("Hello World!\n");
    return 0;
}

Any command-line input or output is written as follows:

gcc hello.c -o hello

./hello

Bold: Indicates a new term, an important word, or words that you see on screen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "Select System info from the Administration panel."

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packt.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in, and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Reviews

Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!

For more information about Packt, please visit packt.com.

Section 1 - Getting Started with Network Programming

This section will get the reader up and running with the basics of networking, the relevant network protocols, and basic socket programming.

The following chapters are in this section:

Chapter 1, Introducing Networks and Protocols

Chapter 2, Getting to Grips with Socket APIs

Chapter 3, An In-Depth Overview of TCP Connections

Chapter 4, Establishing UDP Connections

Chapter 5, Hostname Resolution and DNS

Introducing Networks and Protocols

In this chapter, we will review the fundamentals of computer networking. We'll look at abstract models that attempt to explain the main concerns of networking, and we'll explain the operation of the primary network protocol, the Internet Protocol. We'll look at address families and end with writing programs to list your computer's local IP addresses.

The following topics are covered in this chapter:

Network programming and C

OSI layer model

TCP/IP reference model

The Internet Protocol

IPv4 addresses and IPv6 addresses

Domain names

Internet protocol routing

Network address translation

The client-server paradigm

Listing your IP addresses programmatically from C

Technical requirements

Most of this chapter focuses on theory and concepts. However, we do introduce some sample programs near the end. To compile these programs, you will need a good C compiler. We recommend MinGW on Windows and GCC on Linux and macOS. See Appendix B, Setting Up Your C Compiler on Windows, Appendix C, Setting Up Your C Compiler on Linux, and Appendix D, Setting Up Your C Compiler on macOS, for compiler setup.

The code for this book can be found at: https://github.com/codeplea/Hands-On-Network-Programming-with-C.

From the command line, you can download the code for this chapter with the following command:

git clone https://github.com/codeplea/Hands-On-Network-Programming-with-C

cd Hands-On-Network-Programming-with-C/chap01

On Windows, using MinGW, you can use the following command to compile and run code:

gcc win_list.c -o win_list.exe -liphlpapi -lws2_32

win_list

On Linux and macOS, you can use the following command:

gcc unix_list.c -o unix_list

./unix_list

The internet and C

Today, the internet needs no introduction. Certainly, millions of desktops, laptops, routers, and servers are connected to the internet and have been for decades. However, billions of additional devices are now connected as well—mobile phones, tablets, gaming systems, vehicles, refrigerators, television sets, industrial machinery, surveillance systems, doorbells, and even light bulbs. The new Internet of Things (IoT) trend has people rushing to connect even more unlikely devices every day.

Over 20 billion devices are estimated to be connected to the internet now. These devices use a wide variety of hardware. They connect over Ethernet, Wi-Fi, cellular, phone lines, fiber optics, and other media, but they likely have one thing in common: they use C.

The use of the C programming language is ubiquitous. Almost every network stack is programmed in C. This is true for Windows, Linux, and macOS. If your mobile phone uses Android or iOS, then even though the apps for these were programmed in a different language (Java and Objective-C), the kernel and networking code were written in C. It is very likely that the network routers that your internet data goes through are programmed in C. Even if the user interface and higher-level functions of your modem or router are programmed in another language, the networking drivers are still probably implemented in C.

Networking encompasses concerns at many different abstraction levels. The concerns your web browser has with formatting a web page are much different than the concerns your router has with forwarding network packets. For this reason, it is useful to have a theoretical model that helps us to understand communications at these different levels of abstraction. Let's look at these models now.

OSI layer model

If all of the disparate devices composing the internet are going to communicate seamlessly, there must be agreed-upon standards that define their communications. These standards are called protocols. Protocols define everything from the voltage levels on an Ethernet cable to how a JPEG image is compressed on a web page. Talking about the voltage on an Ethernet cable puts us at a very different level of abstraction from talking about the JPEG image format. If you're programming a website, you don't want to think about Ethernet cables or Wi-Fi frequencies. Likewise, if you're programming an internet router, you don't want to have to worry about how JPEG images are compressed. For this reason, we break the problem down into many smaller pieces.

One common method of breaking down the problem is to place levels of concern into layers. Each layer then provides services for the layer on top of it, and each upper layer can rely on the layers underneath it without concern for how they work.
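The layering idea can be sketched in a few lines of C. This is only a toy illustration, not code from the book's repository: the function names (app_send, network_send, link_send) and the bracketed text "headers" are hypothetical, and the "wire" is just a character buffer. The point is that the top layer calls only the layer directly beneath it and never needs to know what happens further down.

```c
#include <stdio.h>
#include <string.h>

/* Lowest toy layer: "transmits" by writing a tagged copy into `wire`.
   Returns 0 on success, -1 if the result would not fit. */
static int link_send(char *wire, size_t cap, const char *data) {
    int n = snprintf(wire, cap, "[link]%s", data);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}

/* Middle toy layer: adds its own tag, then relies on the layer below. */
static int network_send(char *wire, size_t cap, const char *data) {
    char buf[256];
    int n = snprintf(buf, sizeof buf, "[network]%s", data);
    if (n < 0 || (size_t)n >= sizeof buf) return -1;
    return link_send(wire, cap, buf);
}

/* Top toy layer: knows only about the service directly beneath it. */
int app_send(char *wire, size_t cap, const char *message) {
    if (message == NULL || strlen(message) == 0) return -1;
    return network_send(wire, cap, message);
}
```

Calling app_send(wire, sizeof wire, "hello") with a 64-byte buffer leaves "[link][network]hello" in wire. Swapping out link_send for a different implementation would not require any change to app_send, which is the benefit the layered approach is after.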

The most popular layer system for networking is called the Open Systems Interconnection model (OSI model). Work on the model began in 1977, and it was published as ISO 7498 in 1984. It has seven layers:

Let's understand these layers one by one:

Physical (1): This is the level of physical communication in the real world. At this level, we have specifications for things such as the voltage levels on an Ethernet cable, what each pin on a connector is for, the radio frequency of Wi-Fi, and the light flashes over an optic fiber.

Data Link (2): This level builds on the physical layer. It deals with protocols for directly communicating between two nodes. It defines how a direct message between nodes starts and ends (framing), error detection and correction, and flow control.

Network layer (3): The network layer provides the methods to transmit data sequences (called packets) between nodes in different networks. It provides methods to route packets from one node to another (without a direct physical connection) by transferring through many intermediate nodes. This is the layer that the Internet Protocol is defined on, which we will go into in some depth later.

Transport layer (4): At this layer, we have methods to reliably deliver variable length data between hosts. These methods deal with splitting up data, recombining it, ensuring data arrives in order, and so on. The Transmission Control Protocol (TCP) and User Datagram Protocol (UDP) are commonly said to exist on this layer.

Session layer (5): This layer builds on the transport layer by adding methods to establish, checkpoint, suspend, resume, and terminate dialogs.

Presentation layer (6): This is the lowest layer at which data structure and presentation for an application are defined. Concerns such as data encoding, serialization, and encryption are handled here.

Application layer (7): The applications that the user interfaces with (for example, web browsers and email clients) exist here. These applications make use of the services provided by the six lower layers.

In the OSI model, an application, such as a web browser, exists in the application layer (layer 7). A protocol from this layer, such as HTTP used to transmit web pages, doesn't have to concern itself with how the data is being transmitted. It can rely on services provided by the layer underneath it to effectively transmit data. This is illustrated in the following diagram:

It should be noted that chunks of data are often referred to by different names depending on the OSI layer they're on. A data unit on layer 2 is called a frame, since layer 2 is responsible for framing messages. A data unit on layer 3 is referred to as a packet, while a data unit on layer 4 is a segment if it is part of a TCP connection or a datagram if it is a UDP message.

In this book, we often use the term packet as a generic term to refer to a data unit on any layer. However, segment will only be used in the context of a TCP connection, and datagram will only refer to UDP datagrams.

As we will see in the next section, the OSI model doesn't fit precisely with the common protocols in use today. However, it is still a handy model to explain networking concerns, and it is still in widespread use for that purpose today.

TCP/IP layer model

The TCP/IP protocol suite is the most common network communication model in use today. The TCP/IP reference model differs a bit from the OSI model, as it has only four layers instead of seven.

The following diagram illustrates how the four layers of the TCP/IP model line up to the seven layers of the OSI model:

Notably, the TCP/IP model doesn't match up exactly with the layers in the OSI model. That's OK. In both models, the same functions are performed; they are just divided differently.

The TCP/IP reference model was developed after the TCP/IP protocol was already in common use. It differs from the OSI model by prescribing a less rigid, although still hierarchical, model. For this reason, the OSI model is sometimes better for understanding and reasoning about networking concerns, but the TCP/IP model reflects a more realistic view of how networking is commonly implemented today.

The four layers of the TCP/IP model are as follows:

Network Access layer (1): On this layer, physical connections and data framing happen. Sending an Ethernet or Wi-Fi packet is an example of a layer 1 concern.

Internet layer (2): This layer deals with the concerns of addressing packets and routing them over multiple interconnected networks. It's at this layer that an IP address is defined.

Host-to-Host layer (3): The host-to-host layer provides two protocols, TCP and UDP, which we will discuss in the next few chapters. These protocols address concerns such as data order, data segmentation, network congestion, and error correction.

Process/Application layer (4): The process/application layer is where protocols such as HTTP, SMTP, and FTP are implemented. Most of the programs that feature in this book could be considered to take place on this layer while consuming functionality provided by our operating system's implementation of the lower layers.

Regardless of your chosen abstraction model, real-world protocols do work at many levels. Lower levels are responsible for handling data for the higher levels. These lower-level data structures must, therefore, encapsulate data from the higher levels. Let's look at encapsulating data now.

Data encapsulation

The advantage of these abstractions is that, when programming an application, we only need to consider the highest-level protocol. For example, a web browser needs only to implement the protocols dealing specifically with websites: HTTP, HTML, CSS, and so on. It does not need to bother with implementing TCP/IP, and it certainly doesn't have to understand how an Ethernet or Wi-Fi packet is encoded. It can rely on ready-made implementations of the lower layers for these tasks. These implementations are provided by the operating system (for example, Windows, Linux, and macOS).

When communicating over a network, data must be processed down through the layers at the sender and up again through the layers at the receiver. For example, if we have a web server, Host A, which is transmitting a web page to the receiver, Host B, it may look like this:

The web page contains a few paragraphs of text, but the web server doesn't only send the text by itself. For the text to be rendered correctly, it must be encoded in an HTML structure:

In some cases, the text is already preformatted into HTML and saved that way but, in this example, we are considering a web application that dynamically generates the HTML, which is the most common paradigm for dynamic web pages. As the text cannot be transmitted directly, neither can the HTML. It instead must be transmitted as part of an HTTP response. The web server does this by applying the appropriate HTTP response header to the HTML:

The HTTP is transmitted as part of a TCP session. This isn't done explicitly by the web server, but is taken care of by the operating system's TCP/IP stack:

The TCP packet is routed by an IP packet:

This is transmitted over the wire in an Ethernet packet (or another protocol):

Luckily for us, the lower-level concerns are handled automatically when we use the socket APIs for network programming. It is still useful to know what happens behind the scenes. Without this knowledge, dealing with failures or optimizing for performance is difficult if not impossible.
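To make the topmost step of this encapsulation concrete, here is a minimal sketch of how a web application might wrap an HTML document in an HTTP response before handing it to the operating system. The function name build_http_response() and the particular headers chosen are assumptions made for illustration; they are not taken from the book's later chapters, and a real server would send more headers than this.

```c
#include <stdio.h>
#include <string.h>

/* Wrap an HTML body in a minimal HTTP/1.1 response (illustrative helper).
 * Returns the total response length, or -1 if `out` is too small. */
int build_http_response(const char *html, char *out, size_t out_size) {
    int n = snprintf(out, out_size,
                     "HTTP/1.1 200 OK\r\n"
                     "Content-Type: text/html\r\n"
                     "Content-Length: %zu\r\n"
                     "Connection: close\r\n"
                     "\r\n"          /* blank line ends the HTTP header */
                     "%s",           /* the HTML is the encapsulated payload */
                     strlen(html), html);
    if (n < 0 || (size_t)n >= out_size) return -1;
    return n;
}
```

Calling build_http_response("<html><body><p>Hi</p></body></html>", buffer, sizeof(buffer)) fills buffer with the header followed by the HTML. Everything below this point (the TCP segment, the IP packet, and the Ethernet frame) is added by the operating system and the network hardware once the buffer is written to a socket.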

With some of the theory out of the way, let's dive into the actual protocols powering modern networking.

Domain names

The Internet Protocol can only route packets to an IP address, not a name. So, if you try to connect to a website, such as example.com, your system must first resolve that domain name, example.com, into an IP address for the server that hosts that website. This is done by connecting to a Domain Name System (DNS) server. You connect to a domain name server by knowing in advance its IP address. The IP address for a domain name server is usually assigned by your ISP.

Many other domain name servers are made publicly available by different organizations. Here are a few free and public DNS servers:

DNS Provider         IPv4 Addresses                    IPv6 Addresses
Cloudflare 1.1.1.1   1.1.1.1, 1.0.0.1                  2606:4700:4700::1111, 2606:4700:4700::1001
FreeDNS              37.235.1.174, 37.235.1.177        -
Google Public DNS    8.8.8.8, 8.8.4.4                  2001:4860:4860::8888, 2001:4860:4860::8844
OpenDNS              208.67.222.222, 208.67.220.220    2620:0:ccc::2, 2620:0:ccd::2

To resolve a hostname, your computer sends a UDP message to your domain name server and asks it for an AAAA-type record for the domain you're trying to resolve. If this record exists, an IPv6 address is returned. You can then connect to a server at that address to load the website. If no AAAA record exists, then your computer queries the server again, but asks for an A record. If this record exists, you will receive an IPv4 address for the server. In many cases, a site will publish an A record and an AAAA record that route to the same server.

It is also possible, and common, for multiple records of the same type to exist, each pointing to a different address. This is useful for redundancy in the case where multiple servers can provide the same service.
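From C, we rarely send these DNS queries by hand. The following is a minimal sketch, assuming a POSIX system (on Windows the Winsock headers and a call to WSAStartup() would be needed instead), that asks the operating system's resolver for a hostname and returns the first address as text. The helper name resolve_first() is hypothetical. Note that getaddrinfo() decides for itself which record types to query and in what order to return the results, so the first address may be IPv6 or IPv4 depending on the system.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Resolve `host` and write the first address returned, as text, into `out`.
 * Returns 0 on success, or a non-zero getaddrinfo()/getnameinfo() error code.
 * (Illustrative helper; the name is not part of any library.) */
int resolve_first(const char *host, char *out, size_t out_len) {
    struct addrinfo hints;
    struct addrinfo *results = NULL;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;     /* accept IPv6 (AAAA) or IPv4 (A) */
    hints.ai_socktype = SOCK_STREAM; /* avoid one entry per socket type */

    int rc = getaddrinfo(host, NULL, &hints, &results);
    if (rc != 0) return rc;

    /* Convert the first result's binary address back to a text string. */
    rc = getnameinfo(results->ai_addr, results->ai_addrlen,
                     out, (socklen_t)out_len, NULL, 0, NI_NUMERICHOST);
    freeaddrinfo(results);
    return rc;
}
```

A caller might use it as resolve_first("example.com", buffer, sizeof(buffer)) and print buffer on success. When several records exist, the list returned by getaddrinfo() contains one entry per address, and walking the ai_next pointers visits them all.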

We will see a lot more about DNS queries in Chapter 5, Hostname Resolution and DNS.

Now that we have a basic understanding of IP addresses and names, let's look in detail at how IP packets are routed over the internet.

Internet routing

If all networks contained a maximum of only two devices, then there would be no need for routing. Computer A would just send its data directly over the wire, and computer B would receive it as the only possibility:

The internet today has an estimated 20 billion devices connected. When you make a connection over the internet, your data first transmits to your local router. From there, it is transmitted to another router, which is connected to another router, and so on. Eventually, your data reaches a router that is connected to the receiving device, at which point, the data has reached its destination:

Imagine that each router in the preceding diagram is connected to tens, hundreds, or even thousands of other routers and systems. It's an amazing feat that IP can discover the correct path and deliver traffic seamlessly.

Windows includes a utility, tracert, which lists the routers between your system and the destination system.

Here is an example of using the tracert command on Windows 10 to trace the route to example.com:

As you can see from the example, there are 11 hops between our system and the destination system (example.com, 93.184.216.34). The IP addresses are listed for many of these intermediate routers, but a few are missing with the Request timed out message. This usually means that the system in question doesn't support the part of the Internet Control Message Protocol (ICMP) needed. It's not unusual to see a few such systems when running tracert.

In Unix-based systems, the utility to trace routes is called traceroute. You would use it like traceroute example.com, for example, but the information obtained is essentially the same.

More information on tracert and traceroute can be found in Chapter 12, Network Monitoring and Security.

Sometimes, when IP packets are transferred between networks, their addresses must be translated. This is especially common when using IPv4. Let's look at the mechanism for this next.

Local networks and address translation

It's common for households and organizations to have small Local Area Networks (LANs). As mentioned previously, there are IPv4 address ranges reserved for use in these small local networks.

These reserved private ranges are as follows:

10.0.0.0 to 10.255.255.255

172.16.0.0 to 172.31.255.255

192.168.0.0 to 192.168.255.255
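As a quick illustration, the three private ranges can be tested for with a few mask comparisons. The sketch below assumes a POSIX system for inet_pton() and ntohl(); the function name is_private_ipv4() is our own and not part of any library.

```c
#include <stdint.h>
#include <sys/socket.h>
#include <arpa/inet.h>

/* Return 1 if `text` is a dotted-quad IPv4 address inside one of the three
 * private ranges, 0 if it is outside them, and -1 if it cannot be parsed. */
int is_private_ipv4(const char *text) {
    struct in_addr addr;
    if (inet_pton(AF_INET, text, &addr) != 1) return -1;

    uint32_t ip = ntohl(addr.s_addr);  /* host byte order for comparisons */

    if ((ip & 0xFF000000u) == 0x0A000000u) return 1;  /* 10.0.0.0 to 10.255.255.255 */
    if ((ip & 0xFFF00000u) == 0xAC100000u) return 1;  /* 172.16.0.0 to 172.31.255.255 */
    if ((ip & 0xFFFF0000u) == 0xC0A80000u) return 1;  /* 192.168.0.0 to 192.168.255.255 */
    return 0;
}
```

For example, is_private_ipv4("192.168.182.133") returns 1, while is_private_ipv4("8.8.8.8") returns 0.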

When a packet originates from a device on an IPv4 local network, it must undergo Network Address Translation (NAT) before being routed on the internet. A router that implements NAT remembers which local address a connection is established from.

The devices on the same LAN can directly address one another by their local address. However, any traffic communicated to the internet must undergo address translation by the router. The router does this by modifying the source IP address from the original private LAN IP address to its public internet IP address:

Likewise, when the router receives the return communication, it must modify the destination address from its public IP to the private IP of the original sender. It knows the private IP address because it was stored in memory after the first outgoing packet:

Network address translation can be more complicated than it first appears. In addition to modifying the source IP address in the packet, it must also update the checksums in the packet. Otherwise, the packet would be detected as containing errors and discarded by the next router. The NAT router must also remember which private IP address sent the packet in order to route the reply. Without remembering the translation address, the NAT router wouldn't know where to send the reply to on the private network.
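To see why the checksum matters, consider this minimal sketch of the header rewrite for an outgoing packet. It assumes a plain 20-byte IPv4 header with no options and uses the standard Internet checksum (RFC 1071). The function names are hypothetical, and a real NAT implementation does considerably more: TCP and UDP checksums also cover the IP addresses, so they need updating too.

```c
#include <stddef.h>
#include <stdint.h>

/* Internet checksum (RFC 1071) over `len` bytes. `len` is assumed to be
 * even, which holds for IPv4 headers. The result is in host byte order. */
uint16_t ip_checksum(const uint8_t *data, size_t len) {
    uint32_t sum = 0;
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);
    while (sum >> 16)                      /* fold carries back in */
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Rewrite the source address of a 20-byte IPv4 header (no options), as a
 * NAT router would for an outgoing packet, then recompute the checksum. */
void nat_rewrite_source(uint8_t header[20], const uint8_t new_src[4]) {
    for (int i = 0; i < 4; i++)
        header[12 + i] = new_src[i];       /* bytes 12..15: source address */

    header[10] = 0;                        /* bytes 10..11: header checksum, */
    header[11] = 0;                        /* zeroed before recomputing      */
    uint16_t sum = ip_checksum(header, 20);
    header[10] = (uint8_t)(sum >> 8);
    header[11] = (uint8_t)(sum & 0xFF);
}
```

A receiver validates a header by running the same checksum over all 20 bytes, including the checksum field; a result of zero means the header is intact. If the router changed the source address without updating the checksum, that validation would fail and the packet would be dropped.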

NATs will also modify the packet data in some cases. For example, in the File Transfer Protocol (FTP), some connection information is sent as part of the packet's data. In these cases, the NAT router will look at the packet's data in order to know how to forward future incoming packets. IPv6 largely avoids the need for NAT, as it is possible (and common) for each device to have its own publicly-addressable address.

You may be wondering how a router knows whether a message is locally deliverable or whether it must be forwarded. This is done using a netmask, subnet mask, or CIDR.

Subnetting and CIDR

IP addresses can be split into parts. The most significant bits are used to identify the network or subnetwork, and the least significant bits are used to identify the specific device on the network.

This is similar to how your home address can be split into parts. Your home address includes a house number, a street name, and a city. The city is analogous to the network part, the street name could be the subnetwork part, and your house number is the device part.

IPv4 traditionally uses a mask notation to identify the IP address parts. For example, consider a router on the 10.0.0.0 network with a subnet mask of 255.255.255.0. This router can take any incoming packet and perform a bitwise AND operation with the subnet mask to determine whether the packet belongs on the local subnet or needs to be forwarded on. For example, this router receives a packet to be delivered to 10.0.0.105. It does a bitwise AND operation on this address with the subnet mask of 255.255.255.0, which produces 10.0.0.0. That matches the subnet of the router, so the traffic is local. If, instead, we consider a packet destined for 10.0.15.22, the result of the bitwise AND with the subnet mask is 10.0.15.0. This address doesn't match the subnet the router is on, and so it must be forwarded.
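The bitwise AND described here takes only a few lines of C. This is a minimal sketch, assuming a POSIX system for inet_pton(); the function name is_local() is our own. Because all three values are held in the same (network) byte order, they can be masked and compared directly.

```c
#include <sys/socket.h>
#include <arpa/inet.h>

/* Return 1 if `dest` falls in the subnet given by `network` and `mask`,
 * 0 if it does not, and -1 if any argument is not a valid IPv4 address. */
int is_local(const char *dest, const char *network, const char *mask) {
    struct in_addr d, n, m;
    if (inet_pton(AF_INET, dest, &d) != 1 ||
        inet_pton(AF_INET, network, &n) != 1 ||
        inet_pton(AF_INET, mask, &m) != 1)
        return -1;

    /* Keep only the network bits of each address, then compare. */
    return (d.s_addr & m.s_addr) == (n.s_addr & m.s_addr);
}
```

With the values from the example, is_local("10.0.0.105", "10.0.0.0", "255.255.255.0") returns 1 (deliver locally), and is_local("10.0.15.22", "10.0.0.0", "255.255.255.0") returns 0 (forward).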

IPv6 uses CIDR. Networks and subnetworks are specified using the CIDR notation we described earlier. For example, if the IPv6 subnet is /112, then the router knows that any address that matches on the first 112 bits is on the local subnet.
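The same idea works for a CIDR prefix length: compare the whole bytes covered by the prefix, then mask any remaining bits. The following sketch assumes a POSIX system; the function name ipv6_same_prefix() is hypothetical, and the 2001:db8:: addresses used afterward come from the range reserved for documentation.

```c
#include <string.h>
#include <sys/socket.h>
#include <arpa/inet.h>

/* Return 1 if the two IPv6 addresses agree on their first `prefix_len` bits,
 * 0 if they differ, and -1 on a bad address or prefix length. */
int ipv6_same_prefix(const char *a_text, const char *b_text, int prefix_len) {
    struct in6_addr a, b;
    if (prefix_len < 0 || prefix_len > 128) return -1;
    if (inet_pton(AF_INET6, a_text, &a) != 1 ||
        inet_pton(AF_INET6, b_text, &b) != 1)
        return -1;

    int full_bytes = prefix_len / 8;   /* bytes fully covered by the prefix */
    int extra_bits = prefix_len % 8;   /* leftover bits in the next byte */

    if (memcmp(a.s6_addr, b.s6_addr, (size_t)full_bytes) != 0) return 0;
    if (extra_bits == 0) return 1;

    unsigned char mask = (unsigned char)(0xFF << (8 - extra_bits));
    return (a.s6_addr[full_bytes] & mask) == (b.s6_addr[full_bytes] & mask);
}
```

For a /112 subnet, ipv6_same_prefix("2001:db8::1:5", "2001:db8::1:ffff", 112) returns 1, because only the final 16 bits differ, while ipv6_same_prefix("2001:db8::1:5", "2001:db8::2:5", 112) returns 0.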

So far, we've covered only routing with one sender and one receiver. While this is the most common situation, let's consider alternative cases too.

Multicast, broadcast, and anycast

When a packet is routed from one sender to one receiver, it uses unicast addressing. This is the simplest and most common type of addressing. All of the protocols we deal with in this book use unicast addressing.

Broadcast addressing allows a single sender to address a packet to all recipients simultaneously. It is typically used to deliver a packet to every receiver on an entire subnet.

If a broadcast is a one-to-all communication, then multicast is a one-to-many communication. Multicast involves some group management, and a message is addressed and delivered to members of a group.

Anycast addressed packets are used to deliver a message to one recipient when you don't care who that recipient is. This is useful if you have several servers that provide the same functionality, and you simply want one of them (you don't care which) to handle your request.

IPv4 and lower network levels support local broadcast addressing. IPv4 provides some optional (but commonly implemented) support for multicasting. IPv6 mandates multicasting support while providing additional features over IPv4's multicasting. Although IPv6 does not define broadcast addressing, its multicasting functionality can essentially emulate it.

It's worth noting that these alternative addressing methods don't generally work over the broader internet. Imagine if one peer was able to broadcast a packet to every connected internet device. It would be a mess!

If you can use IP multicasting on your local network, though, it is worthwhile to implement it. Sending one IP level multicast conserves bandwidth compared to sending the same unicast message multiple times.

However, multicasting is often done at the application level. That is, when the application wants to deliver the same message to several recipients, it sends the message multiple times – once to each recipient. In Chapter 3, An In-Depth Overview of TCP Connections, we build a chat room. This chat room could be said to use application-level multicasting, but it does not take advantage of IP multicasting.

We've covered how messages are routed through a network. Now, let's see how a message knows which application is responsible for it once it arrives at a specific system.

Port numbers

An IP address alone isn't quite enough. We need port numbers. To return to the telephone analogy, if IP addresses are phone numbers, then port numbers are like phone extensions.

Generally, an IP address gets a packet routed to a specific system, but a port number is used to route the packet to a specific application on that system.

For example, on your system, you may be running multiple web browsers, an email client, and a video-conferencing client. When your computer receives a TCP segment or UDP datagram, your operating system looks at the destination port number in that packet. That port number is used to look up which application should handle it.

Port numbers are stored as unsigned 16-bit integers. This means that they are between 0 and 65,535 inclusive.
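Because a port must fit in 16 bits, it is worth validating any port number that arrives as text, for example from a command-line argument. Here is a minimal sketch; the function name parse_port() is our own invention.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a decimal port number. Returns 0 and stores the port in `*out` on
 * success, or -1 if the text is not a number in the range 0..65535. */
int parse_port(const char *text, uint16_t *out) {
    char *end = NULL;
    errno = 0;
    long value = strtol(text, &end, 10);

    /* Reject overflow, empty input, and trailing non-digit characters. */
    if (errno != 0 || end == text || *end != '\0') return -1;
    if (value < 0 || value > 65535) return -1;

    *out = (uint16_t)value;
    return 0;
}
```

Here, parse_port("80", &port) succeeds, while parse_port("65536", &port) fails because the value no longer fits in 16 bits. When a port is later stored in a socket address structure, it is converted to network byte order with htons().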

Some port numbers for common protocols are as follows:

Port Number   Transport   Protocol                                 Covered in
20, 21        TCP         File Transfer Protocol (FTP)
22            TCP         Secure Shell (SSH)                       Chapter 11, Establishing SSH Connections with libssh
23            TCP         Telnet
25            TCP         Simple Mail Transfer Protocol (SMTP)     Chapter 8, Making Your Program Send Email
53            UDP         Domain Name System (DNS)                 Chapter 5, Hostname Resolution and DNS
80            TCP         Hypertext Transfer Protocol (HTTP)       Chapter 6, Building a Simple Web Client; Chapter 7, Building a Simple Web Server
110           TCP         Post Office Protocol, Version 3 (POP3)
143           TCP         Internet Message Access Protocol (IMAP)
194           TCP         Internet Relay Chat (IRC)
443           TCP         HTTP over TLS/SSL (HTTPS)                Chapter 9, Loading Secure Web Pages with HTTPS and OpenSSL; Chapter 10, Implementing a Secure Web Server
993           TCP         IMAP over TLS/SSL (IMAPS)
995           TCP         POP3 over TLS/SSL (POP3S)

Each of these listed port numbers is assigned by the Internet Assigned Numbers Authority (IANA). They are responsible for the official assignments of port numbers for specific protocols. Unofficial port usage is very common for applications implementing custom protocols. In this case, the application should try to choose a port number that is not in common use to avoid conflict.

Clients and servers

In the telephone analogy, a call must be initiated first by one party. The initiating party dials the number for the receiving party, and the receiving party answers.

This is also a common paradigm in networking called the client-server model. In this model, a server listens for connections. The client, knowing the address and port number that the server is listening on, establishes the connection by sending the first packet.

For example, the web server at example.com listens on port 80 (HTTP) and port 443 (HTTPS). A web browser (client) must establish the connection by sending the first packet to the web server address and port.

Putting it together

A socket is one end-point of a communication link between systems. It's an abstraction in which your application can send and receive data over the network, in much the same way that your application can read and write to a file using a file handle.

An open socket is uniquely defined by a 5-tuple consisting of the following:

Local IP address

Local port

Remote IP address

Remote port

Protocol (UDP or TCP)

This 5-tuple is important, as it is how your operating system knows which application is responsible for any packets received. For example, if you use two web browsers to establish two simultaneous connections to example.com on port 80, then your operating system keeps the connections separate by looking at the local IP address, local port, remote IP address, remote port, and protocol. In this case, the local IP addresses, remote IP addresses, remote port (80), and protocol (TCP) are identical.

The deciding factor then is the local port (also called the ephemeral port), which the operating system will have chosen to be different for each connection. This 5-tuple is also important for understanding how NAT works. A private network may have many systems accessing the same outside resource, and the NAT router must store this 5-tuple for each connection in order to know how to route received packets back into the private network.
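One way to picture the 5-tuple is as a small structure with an equality test. This is only a sketch for illustration: the names five_tuple and same_socket() are hypothetical, addresses are kept as text for readability, and real operating systems store this information in their own internal formats.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative 5-tuple; not an operating system or library type. */
struct five_tuple {
    char     local_ip[46];   /* textual address; 46 bytes fits any IPv6 string */
    uint16_t local_port;
    char     remote_ip[46];
    uint16_t remote_port;
    int      protocol;       /* 6 for TCP, 17 for UDP (IANA protocol numbers) */
};

/* Two tuples identify the same socket only if all five fields match. */
int same_socket(const struct five_tuple *a, const struct five_tuple *b) {
    return strcmp(a->local_ip, b->local_ip) == 0 &&
           a->local_port == b->local_port &&
           strcmp(a->remote_ip, b->remote_ip) == 0 &&
           a->remote_port == b->remote_port &&
           a->protocol == b->protocol;
}
```

Two browser connections from 192.168.182.133 to 93.184.216.34 on port 80 would share four of the five fields. If the operating system assigned them, say, local ports 49152 and 49153, same_socket() returns 0 for the pair, and that one differing field is what keeps their traffic separate.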

What's your address?

You can find your IP address using the ipconfig command on Windows, or the ifconfig command on Unix-based systems (such as Linux and macOS).

Using the ipconfig command from Windows PowerShell looks like this:

In this example, you can find that the IPv4 address is listed under Ethernet adapter Ethernet0. Your system may have more network adapters, and each will have its own IP address. We can tell that this computer is on a local network because the IP address, 192.168.182.133, is in the private IP address range.

On Unix-based systems, we use either the ifconfig or ip addr commands. The ifconfig command is the old way and is now deprecated on some systems. The ip addr command is the new way, but not all systems support it yet.

Using the ifconfig command from a macOS terminal looks like this:

The IPv4 address is listed next to inet. In this case, we can see that it's 192.168.182.128. Again, we see that this computer is on a local network because of the IP address range. The same adapter has an IPv6 address listed next to inet6.

The following screenshot shows using the ip addr command on Ubuntu Linux:

The preceding screenshot shows the local IPv4 address as 192.168.182.145. We can also see that the link-local IPv6 address is fe80::df60:954e:211:7ff0.

These commands, ifconfig, ip addr, and ipconfig, show the IP address or addresses for each adapter on your computer. You may have several. If you are on a local network, the IP addresses you see will be your local private network IP addresses.

If you are behind a NAT, there is often no good way to know your public IP address. Usually, the only resort is to contact an internet server that provides an API that informs you of your IP address.

A few free and public APIs for this are as follows:

http://api.ipify.org/

http://helloacm.com/api/what-is-my-ip-address/

http://icanhazip.com/

http://ifconfig.me/ip

You can test out these APIs in a web browser:

Each of these listed web pages should return your public IP address and not much else. These sites are useful for when you need to determine your public IP address from behind a NAT programmatically. We look at writing a small HTTP client capable of downloading these web pages and others in Chapter 6, Building a Simple Web Client.

Now that we've seen the built-in utilities for determining our local IP addresses, let's next look at how to accomplish this from C.

Listing network adapters from C

Sometimes, it is useful for your C programs to know what your local address is. For most of this book, we are able to write code that works both on Windows and Unix-based (Linux and macOS) systems. However, the API for listing local addresses is very different between systems. For this reason, we split this program into two: one for Windows and one for Unix-based systems.

We will address the Windows case first.