Clojure High Performance Programming, Second Edition E-Book

Shantanu Kumar

Publisher: Packt Publishing
Category: Specialist literature
Language: English

Description

Clojure treats code as data and has a macro system. It focuses on programming with immutable values and explicit progression-of-time constructs, which are intended to facilitate the development of more robust programs, particularly multithreaded ones. It is built with performance, pragmatism, and simplicity in mind. Like most general purpose languages, various Clojure features have different performance characteristics that one should know in order to write high performance code.
This book shows you how to evaluate the performance implications of various Clojure abstractions, discover their underpinnings, and apply the right approach for optimum performance in real-world programs.
It starts by helping you classify various use cases and the need for them with respect to performance and analysis of various performance aspects. You will also learn the performance vocabulary that experts use throughout the world and discover various Clojure data structures, abstractions, and their performance characteristics. Further, the book will guide you through enhancing performance by using Java interoperability and JVM-specific features from Clojure. It also highlights the importance of using the right concurrent data structure and Java concurrency abstractions.
This book also sheds light on performance metrics for measuring, how to measure, and how to visualize and monitor the collected data. At the end of the book, you will learn to run a performance profiler, identify bottlenecks, tune performance, and refactor code to get a better performance.

Details

You can read the e-book in Legimi apps or in any app that supports the following formats:

EPUB

MOBI

Pages: 275

Year of publication: 2015

Sample

Clojure High Performance Programming Second Edition

Credits

About the Author

About the Reviewers

www.PacktPub.com

Support files, eBooks, discount offers, and more

Why subscribe?

Free access for Packt account holders

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Errata

Piracy

eBooks, discount offers, and more

Questions

1. Performance by Design

Use case classification

The user-facing software

Computational and data-processing tasks

A CPU bound computation

A memory bound task

A cache bound task

An input/output bound task

Online transaction processing

Online analytical processing

Batch processing

A structured approach to the performance

The performance vocabulary

Latency

Throughput

Bandwidth

Baseline and benchmark

Profiling

Performance optimization

Concurrency and parallelism

Resource utilization

Workload

The latency numbers that every programmer should know

Summary

2. Clojure Abstractions

Non-numeric scalars and interning

Identity, value, and epochal time model

Variables and mutation

Collection types

Persistent data structures

Constructing lesser-used data structures

Complexity guarantee

O(<7) implies near constant time

The concatenation of persistent data structures

Sequences and laziness

Laziness

Laziness in data structure operations

Constructing lazy sequences

Custom chunking

Macros and closures

Transducers

Performance characteristics

Transients

Fast repetition

Performance miscellanea

Disabling assertions in production

Destructuring

Recursion and tail-call optimization (TCO)

Premature end of iteration

Multimethods versus protocols

Inlining

Summary

3. Leaning on Java

Inspecting the equivalent Java source for Clojure code

Creating a new project

Compiling the Clojure sources into Java bytecode

Decompiling the .class files into Java source

Compiling the Clojure source without locals clearing

Numerics, boxing, and primitives

Arrays

Reflection and type hints

An array of primitives

Primitives

Macros and metadata

String concatenation

Miscellaneous

Using array/numeric libraries for efficiency

HipHip

primitive-math

Detecting boxed math

Resorting to Java and native code

Proteus – mutable locals in Clojure

Summary

4. Host Performance

The hardware

Processors

Branch prediction

Instruction scheduling

Threads and cores

Memory systems

Cache

Interconnect

Storage and networking

The Java Virtual Machine

The just-in-time compiler

Memory organization

HotSpot heap and garbage collection

Measuring memory (heap/stack) usage

Determining program workload type

Tackling memory inefficiency

Measuring latency with Criterium

Criterium and Leiningen

Summary

5. Concurrency

Low-level concurrency

Hardware memory barrier (fence) instructions

Java support and the Clojure equivalent

Atomic updates and state

Atomic updates in Java

Clojure's support for atomic updates

Faster writes with atom striping

Asynchronous agents and state

Asynchrony, queueing, and error handling

Why you should use agents

Nesting

Coordinated transactional ref and state

Ref characteristics

Ref history and in-transaction deref operations

Transaction retries and barging

Upping transaction consistency with ensure

Lesser transaction retries with commutative operations

Agents can participate in transactions

Nested transactions

Performance considerations

Dynamic var binding and state

Validating and watching the reference types

Java concurrent data structures

Concurrent maps

Concurrent queues

Clojure support for concurrent queues

Concurrency with threads

JVM support for threads

Thread pools in the JVM

Clojure concurrency support

Future

Promise

Clojure parallelization and the JVM

Moore's law

Amdahl's law

Universal Scalability Law

Clojure support for parallelization

pmap

pcalls

pvalues

Java 7's fork/join framework

Parallelism with reducers

Reducible, reducer function, reduction transformation

Realizing reducible collections

Foldable collections and parallelism

Summary

6. Measuring Performance

Performance measurement and statistics

A tiny statistics terminology primer

Median, first quartile, third quartile

Percentile

Variance and standard deviation

Understanding Criterium output

Guided performance objectives

Performance testing

The test environment

What to test

Measuring latency

Comparative latency measurement

Latency measurement under concurrency

Measuring throughput

Average throughput test

The load, stress, and endurance tests

Performance monitoring

Monitoring through logs

Ring (web) monitoring

Introspection

JVM instrumentation via JMX

Profiling

OS and CPU/cache-level profiling

I/O profiling

Summary

7. Performance Optimization

Project setup

Software versions

Leiningen project.clj configuration

Enable reflection warning

Enable optimized JVM options when benchmarking

Distinguish between initialization and runtime

Identifying performance bottlenecks

Latency bottlenecks in Clojure code

Measure only when it is hot

Garbage collection bottlenecks

Threads waiting at GC safepoint

Using jstat to probe GC details

Inspecting generated bytecode for Clojure source

Throughput bottlenecks

Profiling code with VisualVM

The Monitor tab

The Threads tab

The Sampler tab

Setting the thread name

The Profiler tab

The Visual GC tab

The Alternate profilers

Performance tuning

Tuning Clojure code

CPU/cache bound

Memory bound

Multi-threaded

I/O bound

JVM tuning

Back pressure

Summary

8. Application Performance

Choosing libraries

Making a choice via benchmarks

Web servers

Web routing libraries

Data serialization

JSON serialization

JDBC

Logging

Why SLF4J/LogBack?

The setup

Dependencies

The logback configuration file

Optimization

Data sizing

Reduced serialization

Chunking to reduce memory pressure

Sizing for file/network operations

Sizing for JDBC query results

Resource pooling

JDBC resource pooling

I/O batching and throttling

JDBC batch operations

Batch support at API level

Throttling requests to services

Precomputing and caching

Concurrent pipelines

Distributed pipelines

Applying back pressure

Thread pool queues

Servlet containers such as Tomcat and Jetty

HTTP Kit

Aleph

Performance and queueing theory

Little's law

Performance tuning with respect to Little's law

Summary

Index

Clojure High Performance Programming Second Edition

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: November 2013

Second edition: September 2015

Production reference: 1230915

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham B3 2PB, UK.

ISBN 978-1-78528-364-2

www.packtpub.com

Credits

Author

Shantanu Kumar

Reviewers

Eduard Bondarenko

Matjaz Gregoric

Commissioning Editor

Nadeem Bagban

Acquisition Editor

Larissa Pinto

Content Development Editor

Divij Kotian

Technical Editor

Anushree Arun Tendulkar

Copy Editor

Yesha Gangani

Project Coordinator

Nikhil Nair

Proofreader

Safis Editing

Indexer

Tejal Soni

Graphics

Abhinash Sahu

Production Coordinator

Manu Joseph

Cover Work

Manu Joseph

About the Author

Shantanu Kumar is a software developer living in Bengaluru, India. He works with Concur Technologies as a principal engineer, building a next-generation stack in Clojure. He started learning computer programming when he was at school, and has dabbled in several programming languages and software technologies. Having used Java for a long time, he discovered Clojure in early 2009 and has been a fan of it ever since.

When not busy with programming or reading up on technical stuff, he enjoys reading non-fiction and cycling around Bengaluru. Shantanu is an active participant in The Bangalore Clojure Users Group, and contributes to several open source Clojure projects on GitHub. He is also the author of the first edition of the book Clojure High Performance Programming, Packt Publishing.

I am grateful to my colleagues, Saju Pillai and Vijay Mathew, at Concur India for imparting marathon performance analysis/tuning sessions. I appreciate the input received from Andy Fingerhut and Zach Tellman on certain topics during the course of writing the second edition of the book. I also want to thank the technical reviewers and the team at Packt for their valuable input and support.

Writing this book has been an arduous task. I want to thank my family for putting up with me while I was immersed in this book for far too many days and weekends. If not for their support, I would not have been able to do justice to the book.

About the Reviewers

Eduard Bondarenko is a software developer living in Kiev, Ukraine. He started programming in Basic on a ZX Spectrum a long time ago. Later, he worked professionally in the web development domain.

Eduard used Ruby on Rails for many years. Having used Ruby for a long time, he discovered Clojure in early 2009 and liked the language. Besides Ruby and Clojure, he is also interested in Erlang, Scala languages, machine learning, and logic programming.

Matjaz Gregoric is a software developer living in Ljubljana, Slovenia, with his wife and two children. He has a BS degree in physics, and has been developing software professionally since 2007.

During his career, Matjaz worked on various projects where he was able to get familiar with different technologies and programming languages. In 2010, he got familiar with Clojure and immediately fell in love with it. He is currently working on scalable distributed systems and complex web UIs.

www.PacktPub.com

Support files, eBooks, discount offers, and more

For support files and downloads related to your book, please visit www.PacktPub.com.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www2.packtpub.com/books/subscription/packtlib

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.

Why subscribe?

Fully searchable across every book published by Packt

Copy and paste, print, and bookmark content

On demand and accessible via a web browser

Free access for Packt account holders

If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view 9 entirely free books. Simply use your login credentials for immediate access.

Preface

Since the first edition of this book was published in November 2013, Clojure has seen much wider adoption and has witnessed many success stories. The newer versions of Clojure fortify its performance story while staying close to its roots of simplicity and pragmatism. This edition significantly updates the book for Clojure 1.7, and adds a new chapter on performance measurement.

The Java Virtual Machine plays a huge role in the performance of the Clojure programs. This edition of the book increases the focus on the JVM tools for performance management, and it explores how to use those. This book is updated to use Java 8, though it also highlights the places where features from Java 7 or 8 have been used.

This edition has been updated mainly to be of more practical use to readers. I hope that it will better equip readers with performance measurement and profiling tools, and with the know-how for analyzing and tuning the performance characteristics of Clojure code.

What this book covers

Chapter 1, Performance by Design, classifies the various use cases with respect to performance, and analyzes how to interpret their performance aspects and needs.

Chapter 2, Clojure Abstractions, is a guided tour of various Clojure data structures, abstractions (persistent data structures, vars, macros, transducers, and so on), and their performance characteristics.

Chapter 3, Leaning on Java, discusses how to enhance performance by using Java interoperability and features from Clojure.

Chapter 4, Host Performance, discusses how the host stack impacts performance. Being a hosted language, Clojure has its performance directly related to the host.

Chapter 5, Concurrency, is an advanced chapter that discusses the concurrency and parallelism features in Clojure and JVM. Concurrency is an increasingly significant way to derive performance.

Chapter 6, Measuring Performance, covers various aspects of performance benchmarks and measuring other factors.

Chapter 7, Performance Optimization, discusses the systematic steps that need to be taken in order to identify performance bottlenecks and obtain good performance.

Chapter 8, Application Performance, discusses building applications for performance. This involves dealing with external subsystems and factors that impact the overall performance.

What you need for this book

You should acquire Java Development Kit version 8 or higher for your operating system to work through all the examples. This book discusses the Oracle HotSpot JVM, so you may want to get Oracle JDK or OpenJDK (or Zulu) if possible. You should also get the latest Leiningen version (2.5.2 as of the time of writing) from http://leiningen.org/, and JD-GUI from http://jd.benow.ca/.

Who this book is for

This book is for intermediate Clojure programmers who are interested in learning how to write high-performance code. If you are an absolute beginner in Clojure, you should learn the basics of the language first, and then come back to this book. You need not be well-versed in performance engineering or Java. However, some prior knowledge of Java would make it much easier to understand the Java-related chapters.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail <[email protected]>, and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at <[email protected]> with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at <[email protected]>, and we will do our best to address the problem.

Chapter 1. Performance by Design

Clojure is a safe, functional programming language that brings great power and simplicity to the user. Clojure is also dynamically and strongly typed, and has very good performance characteristics. Naturally, every activity performed on a computer has an associated cost. What constitutes acceptable performance varies from one use-case and workload to another. In today's world, performance is even the determining factor for several kinds of applications. We will discuss Clojure (which runs on the JVM (Java Virtual Machine)), and its runtime environment in the light of performance, which is the goal of this book.

The performance of Clojure applications depends on various factors. For a given application, understanding its use cases, design and implementation, algorithms, resource requirements and alignment with the hardware, and the underlying software capabilities is essential. In this chapter, we will study the basics of performance analysis, including the following:

Classifying the performance anticipations by use case type

Outlining the structured approach to analyzing performance

A glossary of terms commonly used to discuss performance aspects

The performance numbers that every programmer should know

Use case classification

The performance requirements and priorities vary across the different kinds of use cases. We need to determine what constitutes acceptable performance for each kind. Hence, we classify them to identify their performance model. When it comes to details, there is no surefire performance recipe for any kind of use case, but it certainly helps to study their general nature. Note that in real life, the use cases listed in this section may overlap with each other.

The user-facing software

The performance of user-facing applications is strongly linked to the user's anticipation. A difference of a good number of milliseconds may not be perceptible to the user, but a wait of more than a few seconds may not be taken kindly. One important element in normalizing anticipation is to engage the user by providing duration-based feedback. A good way to deal with such a scenario is to start the task asynchronously in the background, and poll it from the UI layer to generate duration-based feedback for the user. Another way is to incrementally render the results to the user to even out the anticipation.
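The background-and-poll idea can be sketched in a few lines of Clojure. This is only a minimal illustration under stated assumptions: slow-report, run-with-feedback, and the on-tick callback are hypothetical names invented for this example, and a real UI layer would schedule the polling on its own event loop instead of blocking a thread.

```clojure
(defn slow-report
  "Hypothetical stand-in for a long-running task."
  []
  (Thread/sleep 300)                    ; simulate roughly 300 ms of work
  {:rows 42})

(defn run-with-feedback
  "Starts `task` on a background thread, then calls `on-tick` with the elapsed
  milliseconds every `poll-ms` until the task completes. Returns the result."
  [task poll-ms on-tick]
  (let [started (System/nanoTime)
        fut     (future (task))]
    (loop []
      (if (realized? fut)
        @fut
        (do
          (on-tick (quot (- (System/nanoTime) started) 1000000))
          (Thread/sleep (long poll-ms))
          (recur))))))

;; Collect the progress ticks that a UI would otherwise render.
(def ticks (atom []))
(def result
  (run-with-feedback slow-report 50
                     (fn [elapsed-ms] (swap! ticks conj elapsed-ms))))
```

In a standalone script, call (shutdown-agents) before exiting, since future runs on a thread pool that otherwise keeps the JVM alive for a while.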

Anticipation is not the only factor in user facing performance. Common techniques like staging or precomputation of data, and other general optimization techniques can go a long way to improve the user experience with respect to performance. Bear in mind that all kinds of user facing interfaces fall into this use case category—the Web, mobile web, GUI, command line, touch, voice-operated, gesture...you name it.

Computational and data-processing tasks

Non-trivial compute intensive tasks demand a proportional amount of computational resources. All of the CPU, cache, memory, efficiency and the parallelizability of the computation algorithms would be involved in determining the performance. When the computation is combined with distribution over a network or reading from/staging to disk, I/O bound factors come into play. This class of workloads can be further subclassified into more specific use cases.

A CPU bound computation

A CPU bound computation is limited by the CPU cycles spent on executing it. Arithmetic processing in a loop, small matrix multiplication, determining whether a number is a Mersenne prime, and so on, would be considered CPU bound jobs. If the algorithm complexity is linked to the number of iterations/operations N, such as O(N), O(N²) and more, then the performance depends on how big N is, and how many CPU cycles each step takes. For parallelizable algorithms, the performance of such tasks may be enhanced by assigning multiple CPU cores to the task. On virtual hardware, the performance may be impacted if the CPU cycles are available in bursts.
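As one illustration of a CPU bound job, the Mersenne prime check mentioned above can be sketched with the Lucas-Lehmer test. The function name mersenne-prime? is chosen for this example, and the sketch assumes that the exponent p is itself an odd prime. The loop runs p - 2 iterations of pure arithmetic with no I/O and little memory traffic, so its running time is governed by the cycles each step takes.

```clojure
(defn mersenne-prime?
  "Lucas-Lehmer test for an odd prime exponent p: 2^p - 1 is prime exactly when
  s(p-2) is 0 modulo 2^p - 1, where s(0) = 4 and s(k+1) = s(k)^2 - 2."
  [p]
  (let [two (BigInteger/valueOf 2)
        m   (.subtract (.shiftLeft BigInteger/ONE (int p)) BigInteger/ONE)]
    (loop [s (BigInteger/valueOf 4)
           i 0]
      (if (= i (- p 2))
        (zero? (.signum s))
        (recur (.mod (.subtract (.multiply s s) two) m)
               (inc i))))))

;; Exponents among a few odd primes that yield Mersenne primes.
(filterv mersenne-prime? [3 5 7 11 13 17 19 23])
;; => [3 5 7 13 17 19]
```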

A memory bound task

A memory bound task is limited by the availability and bandwidth of the memory. Examples include large text processing, list processing, and more. For example, specifically in Clojure, the (reduce f (pmap g coll)) operation would be memory bound if coll is a large sequence of big maps, even though we parallelize the operation using pmap here. Note that more CPU resources cannot help when memory is the bottleneck, and vice versa. Lack of available memory may force you to process smaller chunks of data at a time, even if you have enough CPU resources at your disposal. If the maximum speed of your memory is X and your algorithm on a single core accesses the memory at speed X/3, the multicore performance of your algorithm cannot exceed three times the current performance, no matter how many CPU cores you assign to it. The memory architecture (for example, SMP and NUMA) contributes to the memory bandwidth in multicore computers. Performance with respect to memory is also subject to page faults.
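The X versus X/3 arithmetic can be written down as a small function. This is a deliberately simplified model that ignores caches, NUMA effects, and contention; the name max-multicore-speedup and its parameters are assumptions made for this sketch.

```clojure
(defn max-multicore-speedup
  "Upper bound on the speedup over a single core for a memory bound algorithm.
  `memory-max` is the peak memory speed and `per-core` the speed at which the
  single-core run accesses memory, both in the same (arbitrary) unit."
  [memory-max per-core cores]
  (min cores (/ memory-max per-core)))

;; With memory speed X = 30 and a single core using X/3 = 10:
(max-multicore-speedup 30 10 2)    ; => 2, limited by the number of cores
(max-multicore-speedup 30 10 16)   ; => 3, capped by the memory bandwidth
```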

A cache bound task

A task is cache bound when its speed is constrained by the amount of cache available. When a task retrieves values from a small number of repeated memory locations, for example a small matrix multiplication, the values may be cached and fetched from there. Note that CPUs (typically) have multiple layers of cache, and the performance will be at its best when the processed data fits in the cache, but the processing will still happen, more slowly, when the data does not fit into the cache. It is possible to make the most of the cache using cache-oblivious algorithms. A higher number of concurrent cache/memory bound threads than CPU cores is likely to flush the instruction pipeline, as well as the cache at the time of context switch, likely leading to a severely degraded performance.

An input/output bound task

An input/output (I/O) bound task would go faster if the I/O subsystem that it depends on were faster. Disk/storage and network are the most commonly used I/O subsystems in data processing, but it can be a serial port, a USB-connected card reader, or any other I/O device. An I/O bound task may consume very few CPU cycles. Depending on the speed of the device, connection pooling, data compression, asynchronous handling, application caching, and more, may help performance. One notable aspect of I/O bound tasks is that performance usually depends on the time spent waiting for connection/seek and on the amount of serialization that we do, and hardly on the other resources.

In practice, many data processing workloads are usually a combination of CPU bound, memory bound, cache bound, and I/O bound tasks. The performance of such mixed workloads effectively depends on the even distribution of CPU, cache, memory, and I/O resources over the duration of the operation. A bottleneck situation arises only when one resource gets too busy to make way for another.

Online transaction processing

Online transaction processing (OLTP) systems process business transactions on demand. They can sit behind systems such as a user-facing ATM, a point-of-sale terminal, a network-connected ticket counter, ERP systems, and more. OLTP systems are characterized by low latency, availability, and data integrity. They run day-to-day business transactions. Any interruption or outage is likely to have a direct and immediate impact on sales or service. Such systems are expected to be designed for resiliency rather than delayed recovery from failures. When the performance objective is unspecified, you may like to consider graceful degradation as a strategy.

It is a common mistake to ask the OLTP systems to answer analytical queries, something that they are not optimized for. It is desirable for an informed programmer to know the capability of the system, and suggest design changes as per the requirements.

Online analytical processing

Online analytical processing (OLAP) systems are designed to answer analytical queries in a short time. They typically get data from the OLTP operations, and their data model is optimized for querying. They basically provide for consolidation (roll-up), drill-down and slicing and dicing of data for analytical purposes. They often use specialized data stores that can optimize ad-hoc analytical queries on the fly. It is important for such databases to provide pivot-table like capability. Often, the OLAP cube is used to get fast access to the analytical data.

Feeding the OLTP data into the OLAP systems may entail workflows and multistage batch processing. The performance concern of such systems is to efficiently deal with large quantities of data while also dealing with inevitable failures and recovery.

Batch processing

Batch processing is the automated execution of predefined jobs. These are typically bulk jobs that are executed during off-peak hours. Batch processing may involve one or more stages of job processing. Often batch processing is combined with workflow automation, where some workflow steps are executed offline. Many batch processing jobs work on staging data, and on preparing data for the next stage of processing to pick up.

Batch jobs are generally optimized for the best utilization of the computing resources. Since there is only low to moderate demand for lowering the latency of particular subtasks, these systems tend to optimize for throughput. Many batch jobs are largely I/O bound and are often distributed over a cluster. Due to distribution, data locality is preferred when processing the jobs; that is, the data and processing should be local in order to avoid network latency in reading/writing data.

A structured approach to the performance

In practice, the performance of non-trivial applications is rarely a function of coincidence or prediction. For many projects, performance is not optional; it is a necessity, which makes a structured approach even more important today. Capacity planning, determining performance objectives, performance modeling, measurement, and monitoring are key.

Tuning a poorly designed system to perform well is significantly harder, if not practically impossible, compared with designing the system well from the start. In order to meet a performance goal, performance objectives should be known before the application is designed. The performance objectives are stated in terms of latency, throughput, resource utilization, and workload. These terms are discussed in the following section of this chapter.

The resource cost can be identified in terms of application scenarios, such as browsing of products, adding products to shopping cart, checkout, and more. Creating workload profiles that represent users performing various operations is usually helpful.

Performance modeling is a reality check for whether the application design will support the performance objectives. It includes performance objectives, application scenarios, constraints, measurements (benchmark results), workload objectives and if available, the performance baseline. It is not a replacement for measurement and load testing, rather, the model is validated using these. The performance model may include the performance test cases to assert the performance characteristics of the application scenarios.

Deploying an application to production almost always needs some form of capacity planning. It has to take into account the performance objectives for today and for the foreseeable future. It requires an idea of the application architecture, and an understanding of how the external factors translate into the internal workload. It also requires informed expectations about the responsiveness and the level of service to be provided by the system. Often, capacity planning is done early in a project to mitigate the risk of provisioning delays.

The performance vocabulary

There are several technical terms that are heavily used in performance engineering. It is important to understand these, as they form the cornerstone of performance-related discussions. Collectively, these terms form a performance vocabulary. Performance is usually measured in terms of several parameters, each with its own role to play; these parameters are part of the vocabulary.

Latency

Latency is the time taken by an individual unit of work to complete the task. It does not imply successful completion of a task. Latency is not collective; it is linked to a particular task. If two similar jobs, j1 and j2, took 3 ms and 5 ms respectively, their latencies would be recorded as 3 ms and 5 ms. If j1 and j2 were dissimilar tasks, it would make no difference. In many cases, the average latency of similar jobs is used in the performance objectives, measurement, and monitoring results.

Latency is an important indicator of the health of a system. A high performance system often thrives on low latency. Higher than normal latency can be caused by load or by a bottleneck. It helps to measure the latency distribution during a load test. For example, if more than 25 percent of similar jobs, under a similar load, have significantly higher latency than others, then it may be an indicator of a bottleneck scenario that is worth investigating.
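One way to inspect a latency distribution is to compute percentiles over the latencies recorded during a load test. The following is a minimal sketch, assuming the latencies are already available as a collection of millisecond values. The percentile function, the nearest-rank method, and the sample data are illustrative assumptions, not something prescribed by the text.

```clojure
(defn percentile
  "Returns the p-th percentile (0 < p <= 100) of a non-empty collection
  of latencies, using the nearest-rank method."
  [p latencies]
  (let [sorted (vec (sort latencies))
        n      (count sorted)
        ;; nearest rank: the smallest rank covering at least p percent
        rank   (long (Math/ceil (/ (* p n) 100.0)))]
    (nth sorted (dec (max 1 rank)))))

;; Hypothetical latencies (in ms) of ten similar jobs under a similar load
(def sample-latencies [3 4 3 5 4 3 40 4 42 45])

(percentile 50 sample-latencies) ; => 4
(percentile 75 sample-latencies) ; => 40
```

In this made-up sample the median is 4 ms while the 75th percentile is 40 ms, so more than a quarter of the jobs are far slower than the rest, which is the kind of pattern that may be worth investigating.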

When a task called j1 consists of smaller tasks called j2, j3, and j4, the latency of j1 is not necessarily the sum of the latencies of j2, j3, and j4. If any of the subtasks of j1 run concurrently with one another, the latency of j1 will turn out to be less than the sum of the latencies of j2, j3, and j4. I/O-bound tasks are generally more prone to higher latency. In network systems, latency is commonly based on the round-trip to another host, including the latency from source to destination, and then back to source.
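The difference between sequential and concurrent subtasks can be sketched with simple arithmetic. This is a simplified model that ignores coordination overhead and assumes either no overlap or full overlap of the subtasks; the function names and latency values are hypothetical.

```clojure
(defn sequential-latency
  "Latency of a parent task whose subtasks run one after another."
  [subtask-latencies]
  (reduce + subtask-latencies))

(defn concurrent-latency
  "Latency of a parent task whose subtasks all run concurrently,
  ignoring any coordination overhead."
  [subtask-latencies]
  (apply max subtask-latencies))

;; Hypothetical latencies (in ms) of the subtasks j2, j3, and j4
(def subtasks [30 50 40])

(sequential-latency subtasks) ; => 120
(concurrent-latency subtasks) ; => 50
```

Real tasks usually fall somewhere between these two bounds, because only some of the subtasks overlap.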

Throughput

Throughput is the number of successful tasks or operations performed in a unit of time. The top-level operations performed in a unit of time are usually of a similar kind, but with potentially different latencies. So, what does throughput tell us about the system? It is the rate at which the system is performing. When you perform load testing, you can determine the maximum rate at which a particular system can perform. However, this measured maximum is not a conclusive guarantee of the overall maximum rate of performance.

Throughput is one of the factors that determine the scalability of a system. The throughput of a higher level task depends on the capacity to spawn multiple such tasks in parallel, and also on the average latency of those tasks. The throughput should be measured during load testing and performance monitoring to determine the peak measured throughput and the maximum sustained throughput. These factors contribute to the scale and performance of a system.
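The relationship between parallelism, average latency, and throughput can be approximated as follows. This is a rough upper bound, assuming that the given number of tasks is always in flight and that adding parallel tasks does not increase their latency; the function name and the numbers are hypothetical.

```clojure
(defn max-throughput
  "Upper bound on tasks completed per second when `parallelism` tasks
  are in flight at all times, each taking `avg-latency-ms` on average."
  [parallelism avg-latency-ms]
  (/ (* parallelism 1000.0) avg-latency-ms))

(max-throughput 1 5) ; => 200.0 tasks/sec with a single worker
(max-throughput 8 5) ; => 1600.0 tasks/sec with eight parallel workers
```

Measured throughput is typically lower than this bound, since contention for shared resources tends to raise latency as parallelism grows.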

Bandwidth

Bandwidth is the raw data rate over a communication channel, measured in bits per second, for example Kbits/sec or Mbits/sec. This includes not only the payload, but also all the overhead necessary to carry out the communication. An uppercase B, as in KB/sec, denotes bytes (kilobytes per second). Bandwidth is often compared to throughput. While bandwidth is the raw capacity, throughput for the same system is the successful task completion rate, which usually involves a round-trip. Note that throughput is for an operation that involves latency. To achieve maximum throughput for a given bandwidth, the communication/protocol overhead and operational latency should be minimal.
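To see how overhead and latency reduce the throughput achievable over a given bandwidth, consider the following sketch. It assumes a single client that issues one operation at a time, where each operation pays one round-trip latency plus the time needed to move its payload and protocol overhead across the channel. The function name, the map keys, and all numbers are hypothetical.

```clojure
(defn ops-per-second
  "Operations per second for a single, non-pipelined client over a channel."
  [{:keys [bandwidth-bits-per-sec payload-bytes overhead-bytes round-trip-ms]}]
  (let [bits-on-wire (* 8 (+ payload-bytes overhead-bytes)) ; bytes -> bits
        transfer-sec (/ bits-on-wire bandwidth-bits-per-sec)
        latency-sec  (/ round-trip-ms 1000)]
    (double (/ 1 (+ transfer-sec latency-sec)))))

;; Hypothetical channel: 8 Mbits/sec, 900-byte payload, 100-byte overhead
(def link {:bandwidth-bits-per-sec 8000000
           :payload-bytes          900
           :overhead-bytes         100})

(ops-per-second (assoc link :round-trip-ms 0)) ; => 1000.0
(ops-per-second (assoc link :round-trip-ms 4)) ; => 200.0
```

With these made-up numbers, a 4 ms round-trip cuts the achievable rate to a fifth of what the raw bandwidth would allow.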

For storage systems (such as hard disks and solid-state drives), the predominant way to measure performance is IOPS (input/output operations per second), which is multiplied by the transfer size and expressed as bytes per second, or further as MB/sec, GB/sec, and so on. IOPS is usually measured for sequential and random workloads, for both read and write operations.
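The conversion from IOPS to a data rate is a simple multiplication, sketched below. The function names and the figures are hypothetical, and MB is taken here to mean 1,000,000 bytes.

```clojure
(defn iops->bytes-per-sec
  "Data rate in bytes per second for a given IOPS and transfer size."
  [iops transfer-size-bytes]
  (* iops transfer-size-bytes))

(defn bytes->mb
  "Converts bytes to megabytes, assuming 1 MB = 1,000,000 bytes."
  [n]
  (/ n 1e6))

;; Hypothetical drive: 10,000 IOPS at a 4 KiB (4096-byte) transfer size
(iops->bytes-per-sec 10000 4096)             ; => 40960000
(bytes->mb (iops->bytes-per-sec 10000 4096)) ; => 40.96
```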

Mapping the throughput of a system to the bandwidth of another may lead to dealing with an impedance mismatch between the two. For example, an order processing system may perform the following tasks:

Transact with the database on disk
Post results over the network to an external system

The throughput of order processing depends on its execution model, and not only on the bandwidth of the disk subsystem and the network but also on how loaded they currently are. Parallelism and pipelining are common ways to increase the throughput over a given bandwidth.
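The effect of pipelining on the two order processing steps can be estimated with a simplified model. It assumes that each stage handles one order at a time, that the stages overlap perfectly when pipelined, and that neither the disk nor the network is otherwise loaded. The function names and stage latencies are hypothetical.

```clojure
(defn sequential-orders-per-sec
  "Orders per second when each order finishes every stage before the
  next order starts."
  [stage-latencies-ms]
  (/ 1000.0 (reduce + stage-latencies-ms)))

(defn pipelined-orders-per-sec
  "Orders per second when the stages overlap, so that the slowest stage
  limits the rate."
  [stage-latencies-ms]
  (/ 1000.0 (apply max stage-latencies-ms)))

;; Hypothetical stages: 8 ms database transaction, 2 ms network post
(def stages [8 2])

(sequential-orders-per-sec stages) ; => 100.0
(pipelined-orders-per-sec stages)  ; => 125.0
```

In this model the slowest stage sets the ceiling, which is why relieving the bottleneck stage usually gains more than speeding up any other stage.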

Baseline and benchmark

The performance baseline, or simply baseline, is the reference point, including measurements of well-characterized and understood performance parameters for a known configuration. The baseline is used to collect performance measurements for the same parameters that we may benchmark later for another configuration. For example, collecting "throughput distribution over 10 minutes at a load of 50 concurrent threads" is one such performance parameter that we can use for baseline and benchmarking. A baseline is recorded together with the hardware, network, OS and JVM configuration.
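Since a baseline is tied to a known configuration, it can help to capture part of that configuration programmatically alongside the measurements. The following is a minimal sketch that reads standard JVM system properties and Runtime values; the function names and the shape of the map are assumptions, and hardware and network details would still need to be recorded separately.

```clojure
(defn environment-snapshot
  "Collects the OS and JVM details visible from inside the running JVM."
  []
  (let [runtime (Runtime/getRuntime)]
    {:os-name         (System/getProperty "os.name")
     :os-version      (System/getProperty "os.version")
     :os-arch         (System/getProperty "os.arch")
     :java-version    (System/getProperty "java.version")
     :jvm-name        (System/getProperty "java.vm.name")
     :processors      (.availableProcessors runtime)
     :max-heap-bytes  (.maxMemory runtime)
     :clojure-version (clojure-version)}))

(defn baseline-record
  "Bundles the measurements of one performance parameter with the
  environment they were collected in."
  [parameter measurements]
  {:parameter    parameter
   :measurements measurements
   :environment  (environment-snapshot)})

;; The measurements are left empty here; they would come from a load test.
(baseline-record
  "throughput distribution over 10 minutes at a load of 50 concurrent threads"
  [])
```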

The performance benchmark