Scala is a modern, multiparadigm programming language designed to express common programming patterns in a concise, elegant, and type-safe way. Scala smoothly integrates the features of object-oriented and functional languages.
In this second edition, you will find updated coverage of the Scala 2.12 platform. The Scala 2.12 series targets Java 8 and requires it for execution. The book starts by introducing you to the foundations of concurrent programming on the JVM, outlining the basics of the Java Memory Model, and then shows some of the classic building blocks of concurrency, such as atomic variables, thread pools, and concurrent data structures, along with the caveats of traditional concurrency.
The book then walks you through different high-level concurrency abstractions, each tailored toward a specific class of programming tasks, while touching on the latest advancements of async programming capabilities of Scala. It also covers some useful patterns and idioms to use with the techniques described. Finally, the book presents an overview of when to use which concurrency library and demonstrates how they all work together, and then presents new exciting approaches to building concurrent and distributed systems.
Who this book is written for
If you are a Scala programmer with no prior knowledge of concurrent programming, or seeking to broaden your existing knowledge about concurrency, this book is for you. Basic knowledge of the Scala programming language will be helpful.
Copyright © 2017 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: November 2014
Second edition: February 2017
Production reference: 1170217
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-78646-689-1
www.packtpub.com
Author
Aleksandar Prokopec
Copy Editor
Safis Editing
Reviewers
Vikash Sharma
Dominik Gruntz
Zhen Li
Lukas Rytz
Michel Schinz
Samira Tasharofi
Project Coordinator
Vaidehi Sawant
Commissioning Editor
Aaron Lazar
Proofreader
Safis Editing
Acquisition Editor
Sonali Vernekar
Indexer
Aishwarya Gangawane
Content Development Editor
Rohit Kumar Singh
Graphics
Jason Monteiro
Technical Editor
Pavan Ramchandani
Production Coordinator
Shantanu Zagade
Concurrent and parallel programming have progressed from niche disciplines, of interest only to kernel programmers and high-performance computing specialists, to something that every competent programmer must know. As parallel and distributed computing systems are now the norm, most applications are concurrent, be it for increasing performance or for handling asynchronous events.
Most developers, however, are unprepared to deal with this revolution. Maybe they have learned the traditional concurrency model, which is based on threads and locks, in school, but this model has become inadequate for dealing with massive concurrency in a reliable manner and with acceptable productivity. Indeed, threads and locks are hard to use and harder to get right. To make progress, one needs to use concurrency abstractions that are higher level and composable.
Fifteen years ago, I worked on a predecessor of Scala: Funnel was an experimental programming language that had concurrent semantics at its core. All the programming concepts were explained in this language as syntactic sugar on top of functional nets, an object-oriented variant of the join calculus. Even though the join calculus is a beautiful theory, we realized after some experimentation that the concurrency problem is more multifaceted than what can be comfortably expressed in a single formalism. There is no silver bullet for all concurrency issues; the right solution depends on what one needs to achieve. Do you want to define asynchronous computations that react to events or streams of values? Or have autonomous, isolated entities communicating via messages? Or define transactions over a mutable store? Or is the primary purpose of parallel execution simply to increase performance? For each of these tasks, there is an abstraction that does the job: futures, reactive streams, actors, transactional memory, or parallel collections.
This brings us to Scala and this book. As there are so many useful concurrency abstractions, it seems unattractive to hardcode them all in a programming language. The purpose behind the work on Scala was to make it easy to define high-level abstractions in user code and libraries. This way, one can define the modules handling the different aspects of concurrent programming. All of these modules would be built on a low-level core that is provided by the host system. In retrospect, this approach has worked well. Today, Scala has some of the most powerful and elegant libraries for concurrent programming. This book will take you on a tour of the most important ones, explaining the use case for each and the application patterns.
This book could not have a more expert author. Aleksandar Prokopec contributed to some of the most popular Scala libraries for concurrent and parallel programming. He also invented some of the most intricate data structures and algorithms. With this book, he has created something that is at the same time a readable tutorial and an authoritative reference for the area in which he has worked. I believe that Learning Concurrent Programming in Scala, Second Edition will be mandatory reading for everyone who writes concurrent and parallel programs in Scala. I also expect to see it on the bookshelves of many people who just want to find out about this fascinating and fast-moving area of computing.
Martin Odersky
Professor at EPFL, the creator of Scala
Aleksandar Prokopec, who also authored the first edition of this book, is a concurrent and distributed programming researcher. He holds a PhD in computer science from the École Polytechnique Fédérale de Lausanne, Switzerland. He has worked at Google and is currently a principal researcher at Oracle Labs.
As a member of the Scala team at EPFL, Aleksandar actively contributed to the Scala programming language, and he has worked on programming abstractions for concurrency, data-parallel programming support, and concurrent data structures for Scala. He created the Scala Parallel Collections framework, which is a library for high-level data-parallel programming in Scala, and participated in working groups for Scala concurrency libraries, such as Futures, Promises, and ScalaSTM. Aleksandar is the primary author of the reactor programming model for distributed computing.
First of all, I would like to thank my reviewers, Samira Tasharofi, Lukas Rytz, Dominik Gruntz, Michel Schinz, Zhen Li, and Vladimir Kostyukov for their excellent feedback and useful comments. I would also like to thank the editors at Packt, Kevin Colaco, Sruthi Kutty, Kapil Hemnani, Vaibhav Pawar, and Sebastian Rodrigues for their help with writing this book. It really was a pleasure to work with these people.
The concurrency frameworks described in this book wouldn't have seen the light of day without the collaborative effort of a large number of people. Many individuals have, directly or indirectly, contributed to the development of these utilities. These people are the true heroes of Scala concurrency, and they are to thank for Scala's excellent support for concurrent programming. It is difficult to enumerate all of them here, but I tried my best. If somebody feels left out, they should ping me, and they'll probably appear in the next edition of this book.
It goes without saying that Martin Odersky is to thank for creating the Scala programming language, which was used as a platform for the concurrency frameworks described in this book. Special thanks goes to him, all the people that were part of the Scala team at the EPFL through the last 10 or more years, and the people at Typesafe, who are working hard to keep Scala one of the best general purpose languages out there.
Most of the Scala concurrency frameworks rely on the work of Doug Lea, in one way or another. His Fork/Join framework underlies the implementation of the Akka actors, Scala Parallel Collections, and the Futures and Promises library, and many of the JDK concurrent data structures described in this book are his own implementation. Many of the Scala concurrency libraries were influenced by his advice.
The Scala Futures and Promises library was initially designed by Philipp Haller, Heather Miller, Vojin Jovanović, and me from the EPFL, Viktor Klang and Roland Kuhn from the Akka team, and Marius Eriksen from Twitter, with contributions from Havoc Pennington, Rich Dougherty, Jason Zaugg, Doug Lea, and many others.
Although I was the main author of the Scala Parallel Collections, this library benefited from the input of many different people, including Phil Bagwell, Martin Odersky, Tiark Rompf, Doug Lea, and Nathan Bronson. Later on, Dmitry Petrashko and I started working on an improved version of parallel and standard collection operations, optimized through the use of Scala Macros. Eugene Burmako and Denys Shabalin are among the main contributors to the Scala Macros project.
The work on the Rx project was started by Erik Meijer, Wes Dyer, and the rest of the Rx team. Since its original .NET implementation, the Rx framework has been ported to many different languages, including Java, Scala, Groovy, JavaScript, and PHP, and has gained widespread adoption thanks to the contributions and the maintenance work of Ben Christensen, Samuel Grütter, Shixiong Zhu, Donna Malayeri, and many other people.
Nathan Bronson is one of the main contributors to the ScalaSTM project, whose default implementation is based on Nathan’s CCSTM project. The ScalaSTM API was designed by the ScalaSTM expert group, composed of Nathan Bronson, Jonas Bonér, Guy Korland, Krishna Sankar, Daniel Spiewak, and Peter Veentjer.
The initial Scala actor library was inspired by the Erlang actor model and developed by Philipp Haller. This library inspired Jonas Bonér to start the Akka actor framework. The Akka project had many contributors, including Viktor Klang, Henrik Engström, Peter Vlugter, Roland Kuhn, Patrik Nordwall, Björn Antonsson, Rich Dougherty, Johannes Rudolph, Mathias Doenitz, Philipp Haller, and many others.
Finally, I would like to thank the entire Scala community for their contributions, and for making Scala an awesome programming language.
Vikash Sharma is a software developer and open source technology evangelist, located in India. He tries to keep things simple, which helps him write clean and manageable code. He has authored a video course for Scala. He is employed as an associate consultant with Infosys and has also worked as a Scala developer.
A simple thank you would not suffice for the support I got from my family: Mom, Dad, and my brother. I really appreciate everyone who was there when I needed them the most. Special thanks to Vijay Athikesavan for passing on to me his insights into coding.
Dominik Gruntz has a PhD from ETH Zürich and has been a Professor of Computer Science at the University of Applied Sciences FHNW since 2000. Besides his research projects, he teaches a course on concurrent programming. Some years ago, the goal of this course was to convince the students that writing correct concurrent programs is too complicated for mere mortals (an educational objective that was regularly achieved). With the availability of high-level concurrency frameworks in Java and Scala, this has changed, and this book, Learning Concurrent Programming in Scala, is a great resource for all programmers who want to learn how to write correct, readable, and efficient concurrent programs. This book is the ideal textbook for a course on concurrent programming.
I am thankful that I could support this project as a reviewer.
Zhen Li acquired an enthusiasm for computing early in elementary school, when she first learned Logo. After earning a Software Engineering degree at Fudan University in Shanghai, China, and a Computer Science degree from University College Dublin, Ireland, she moved to the University of Georgia in the United States for her doctoral study and research. She focused on the psychological aspects of programmers' learning behaviors, especially the way programmers understand concurrent programs. Based on this research, she aimed to develop effective software engineering methods and teaching paradigms to help programmers embrace concurrent programming.
Zhen Li has practical teaching experience with undergraduate students on a variety of computer science topics, including system and network programming, modeling and simulation, as well as human-computer interaction. Her major contributions in teaching computer programming were to author syllabi and offer courses that used various programming languages and multiple modalities of concurrency, encouraging students to actively acquire a software design philosophy and learn concurrent programming comprehensively.
Zhen Li also has extensive working experience in industrial innovation. She has worked in various IT companies, including Oracle, Microsoft, and Google, over the past 10 years, where she participated in the development of cutting-edge products, platforms, and infrastructures for core enterprise and cloud business technologies. Zhen Li is passionate about programming and teaching. You are welcome to contact her at [email protected].
Lukas Rytz is a compiler engineer working in the Scala team at Typesafe. He received his PhD from EPFL in 2013, where he was advised by Martin Odersky, the inventor of the Scala programming language.
Michel Schinz is a lecturer at EPFL.
Samira Tasharofi received her PhD in the field of Software Engineering from the University of Illinois at Urbana-Champaign. She has conducted research on various areas, such as testing concurrent programs and in particular actor programs, patterns in parallel programming, and verification of component-based systems.
Samira has reviewed several books, such as Actors in Scala, Parallel Programming with Microsoft .NET: Design Patterns for Decomposition and Coordination on Multicore Architectures (Patterns and Practices), and Parallel Programming with Microsoft Visual C++: Design Patterns for Decomposition and Coordination on Multicore Architectures (Patterns and Practices). She was also among the reviewers of the research papers for software engineering conferences, including ASE, AGERE, SPLASH, FSE, and FSEN. She has served as a PC member of the 4th International Workshop on Programming based on Actors, Agents, and Decentralized Control (AGERE 2014) and 6th IPM International Conference on Fundamentals of Software Engineering (FSEN 2015).
I would like to thank my husband and mom for their endless love and support.
For support files and downloads related to your book, please visit www.PacktPub.com.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www.packtpub.com/mapt
Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.
Thanks for purchasing this Packt book. At Packt, quality is at the heart of our editorial process. To help us improve, please leave us an honest review on this book's Amazon page at https://goo.gl/ldH2Nv.
If you'd like to join our team of regular reviewers, you can email us at [email protected]. We award our regular reviewers with free eBooks and videos in exchange for their valuable feedback. Help us be relentless in improving our products!
Dedicated to Sasha,
She’s probably the only PhD in physical chemistry ever to read this book.
Concurrency is everywhere. With the rise of multicore processors in the consumer market, the need for concurrent programming has overwhelmed the developer world. Where it was once used to express asynchrony in programs and computer systems, and was largely an academic discipline, concurrent programming is now a pervasive methodology in software development. As a result, advanced concurrency frameworks and libraries are sprouting at an amazing rate. Recent years have witnessed a renaissance in the field of concurrent computing.
As the level of abstraction grows in modern languages and concurrency frameworks, it is becoming crucial to know how and when to use them. Having a good grasp of the classical concurrency and synchronization primitives, such as threads, locks, and monitors, is no longer sufficient. High-level concurrency frameworks, which solve many issues of traditional concurrency and are tailored towards specific tasks, are gradually overtaking the world of concurrent programming.
This book describes high-level concurrent programming in Scala. It presents detailed explanations of various concurrency topics and covers the basic theory of concurrent programming. Simultaneously, it describes modern concurrency frameworks, shows their detailed semantics, and teaches you how to use them. Its goal is to introduce important concurrency abstractions and, at the same time, show how they work in real code.
We are convinced that, by reading this book, you will gain both a solid theoretical understanding of concurrent programming and develop a set of useful practical skills that are required to write correct and efficient concurrent programs. These skills are the first steps toward becoming a modern concurrency expert.
We hope that you will have as much fun reading this book as we did writing it.
This book is organized into a sequence of chapters with various topics on concurrent programming. The book covers the fundamental concurrent APIs that are a part of the Scala runtime, introduces more complex concurrency primitives, and gives an extensive overview of high-level concurrency abstractions.
Chapter 1, Introduction, explains the need for concurrent programming and gives some philosophical background. At the same time, it covers the basics of the Scala programming language that are required for understanding the rest of this book.
Chapter 2, Concurrency on the JVM and the Java Memory Model, teaches you the basics of concurrent programming. This chapter will teach you how to use threads and how to protect access to shared memory and introduce the Java Memory Model.
Chapter 3, Traditional Building Blocks of Concurrency, presents classic concurrency utilities, such as thread pools, atomic variables, and concurrent collections, with a particular focus on the interaction with the features of the Scala language. The emphasis in this book is on the modern, high-level concurrent programming frameworks. Consequently, this chapter presents an overview of traditional concurrent programming techniques, but it does not aim to be extensive.
Chapter 4, Asynchronous Programming with Futures and Promises, is the first chapter that deals with a Scala-specific concurrency framework. This chapter presents the futures and promises API and shows how to correctly use them when implementing asynchronous programs.
Chapter 5, Data-Parallel Collections, describes the Scala parallel collections framework. In this chapter, you will learn how to parallelize collection operations, when it is allowed to parallelize them, and how to assess the performance benefits of doing so.
Chapter 6, Concurrent Programming with Reactive Extensions, teaches you how to use the Reactive Extensions framework for event-based and asynchronous programming. You will see how the operations on event streams correspond to collection operations, how to pass events from one thread to another, and how to design a reactive user interface using event streams.
Chapter 7, Software Transactional Memory, introduces the ScalaSTM library for transactional programming, which aims to provide a safer, more intuitive, shared-memory programming model. In this chapter, you will learn how to protect access to shared data using scalable memory transactions and, at the same time, reduce the risk of deadlocks and race conditions.
Chapter 8, Actors, presents the actor programming model and the Akka framework. In this chapter, you will learn how to transparently build message-passing distributed programs that run on multiple machines.
Chapter 9, Concurrency in Practice, summarizes the different concurrency libraries introduced in the earlier chapters. In this chapter, you will learn how to choose the correct concurrency abstraction to solve a given problem, and how to combine different concurrency abstractions together when designing larger concurrent applications.
Chapter 10, Reactors, presents the reactor programming model, whose focus is improved composition in concurrent and distributed programs. This emerging model enables separation of concurrent and distributed programming patterns into modular components called protocols.
While we recommend that you read the chapters in the order in which they appear, this is not strictly necessary. If you are well acquainted with the content in Chapter 2, Concurrency on the JVM and the Java Memory Model, you can study most of the other chapters directly. The only chapters that rely on the content from all the preceding chapters are Chapter 9, Concurrency in Practice, where we present a practical overview of the topics in this book, and Chapter 10, Reactors, for which it is helpful to understand how actors and event streams work.
In this section, we describe some of the requirements that are necessary to read and understand this book. We explain how to install the Java Development Kit, which is required to run Scala programs, and show how to use the Simple Build Tool to run various examples.
We will not require an IDE in this book. The program that you use to write code is entirely up to you, and you can choose anything, such as Vim, Emacs, Sublime Text, Eclipse, IntelliJ IDEA, Notepad++, or some other text editor.
Scala programs are not compiled directly to the native machine code, so they cannot be run as executables on various hardware platforms. Instead, the Scala compiler produces an intermediate code format called the Java bytecode. To run this intermediate code, your computer must have the Java Virtual Machine software installed. In this section, we explain how to download and install the Java Development Kit, which includes the Java Virtual Machine and other useful tools.
There are multiple implementations of the JDK that are available from different software vendors. We recommend that you use the Oracle JDK distribution. To download and install the Java Development Kit, follow these steps:
It is possible that your operating system already has JDK installed. To verify this, simply run the javac command, as we did in the last step in the preceding description.
Simple Build Tool (SBT) is a command-line build tool used for Scala projects. Its purpose is to compile Scala code, manage dependencies, support continuous compilation and testing, and handle deployment, among other tasks. Throughout this book, we will use SBT to manage our project dependencies and run example code.
To install SBT, follow these instructions:
You are now ready to use SBT. In the following steps, we will create a new SBT project:
Now, you can start writing Scala programs. Open your editor, and create a source code file named HelloWorld.scala in the src/main/scala/org/learningconcurrency directory. Add the following contents to the HelloWorld.scala file:
package org.learningconcurrency

object HelloWorld extends App {
  println("Hello, world!")
}

Now, go back to the terminal window with the SBT interactive shell and run the program with the following command:
> run

Running this program should give the following output:
Hello, world!

These steps are sufficient to run most of the examples in this book. Occasionally, we will rely on external libraries when running the examples. These libraries are resolved automatically by SBT from standard software repositories. For some libraries, we will need to specify additional software repositories, so we add the following lines to our build.sbt file:
resolvers ++= Seq(
  "Sonatype OSS Snapshots" at "https://oss.sonatype.org/content/repositories/snapshots",
  "Sonatype OSS Releases" at "https://oss.sonatype.org/content/repositories/releases",
  "Typesafe Repository" at "http://repo.typesafe.com/typesafe/releases/"
)

Now that we have added all the necessary software repositories, we can add some concrete libraries. By adding the following line to the build.sbt file, we obtain access to the Apache Commons IO library:
libraryDependencies += "commons-io" % "commons-io" % "2.4"

After changing the build.sbt file, it is necessary to reload any running SBT instances. In the SBT interactive shell, we need to enter the following command:
> reload

This enables SBT to detect any changes in the build definition file and download additional software packages when necessary.
Different Scala libraries live in different namespaces called packages. To obtain access to the contents of a specific package, we use the import statement. When we use a specific concurrency library in an example for the first time, we will always show the necessary set of import statements. On subsequent uses of the same library, we will not repeat the same import statements.
Similarly, we avoid adding package declarations in the code examples to keep them short. Instead, we assume that the code in a specific chapter is in the similarly named package. For example, all the code belonging to Chapter 2, Concurrency on the JVM and the Java Memory Model, resides in the org.learningconcurrency.ch2 package. Source code files for the examples presented in that chapter begin with the following code:
package org.learningconcurrency
package ch2

Finally, this book deals with concurrency and asynchronous execution. Many of the examples start a concurrent computation that continues executing after the main execution stops. To make sure that these concurrent computations always complete, we will run most of the examples in the same JVM instance as SBT itself. We add the following line to our build.sbt file:
fork := false

In the examples where running in a separate JVM process is required, we will point this out and give clear instructions.
An advantage of using an Integrated Development Environment (IDE) such as Eclipse or IntelliJ IDEA is that you can write, compile, and run your Scala programs automatically. In this case, there is no need to install SBT, as described in the previous section. While we advise that you run the examples using SBT, you can alternatively use an IDE.
There is an important caveat when running the examples in this book using an IDE: editors such as Eclipse and IntelliJ IDEA run the program inside a separate JVM process. As mentioned in the previous section, certain concurrent computations continue executing after the main execution stops. To make sure that they always complete, you will sometimes need to add sleep statements at the end of the main execution, which slow down the main execution. In most of the examples in this book, the sleep statements are already added for you, but in some programs, you might have to add them yourself.
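For illustration, a minimal sketch of such a program might look like the following (this sketch assumes the futures API introduced in Chapter 4, and the one-second sleep duration is an arbitrary value chosen for this example):

package org.learningconcurrency

import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object SleepExample extends App {
  // Start an asynchronous computation; the global execution context uses
  // daemon threads, so it does not keep the JVM process alive on its own.
  Future { println("Running concurrently.") }
  // Pause the main execution so that the asynchronous computation
  // has a chance to complete before the process exits.
  Thread.sleep(1000)
}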
This book is primarily intended for developers who have learned how to write sequential Scala programs, and wish to learn how to write correct concurrent programs. The book assumes that you have a basic knowledge of the Scala programming language. Throughout this book, we strive to use the simple features of Scala in order to demonstrate how to write concurrent programs. Even with an elementary knowledge of Scala, you should have no problem understanding various concurrency topics.
This is not to say that the book is limited to Scala developers. Whether you have experience with Java, come from a .NET background, or are generally a programming language aficionado, chances are that you will find the content in this book insightful. A basic understanding of object-oriented or functional programming should be a sufficient prerequisite.
Finally, this book is a good introduction to modern concurrent programming in the broader sense. Even if you have the basic knowledge about multithreaded computing, or the JVM concurrency model, you will learn a lot about modern, high-level concurrency utilities. Many of the concurrency libraries in this book are only starting to find their way into mainstream programming languages, and some of them are truly cutting-edge technologies.
Feedback from our readers is always welcome. Let us know what you think about this book, what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of. To send us general feedback, simply e-mail [email protected], and mention the book's title in the subject of your message. If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
You can download the code files by following these steps:
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/LearningConcurrentProgrammingInScalaSecondEdition. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/LearningConcurrentProgrammingInScalaSecondEdition_ColorImages.pdf.
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books, maybe a mistake in the text or the code, we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.
To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.
Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.
Please contact us at [email protected] with a link to the suspected pirated material.
We appreciate your help in protecting our authors and our ability to bring you valuable content.
If you have a problem with any aspect of this book, you can contact us at [email protected], and we will do our best to address the problem.
"For over a decade prophets have voiced the contention that the organization of a single computer has reached its limits and that truly significant advances can be made only by interconnection of a multiplicity of computers."
--Gene Amdahl, 1967

Although the discipline of concurrent programming has a long history, it has gained a lot of traction in recent years with the arrival of multicore processors. Recent developments in computer hardware not only revived some classical concurrency techniques, but also started a major paradigm shift in concurrent programming. At a time when concurrency is becoming so important, an understanding of concurrent programming is an essential skill for every software developer.
This chapter explains the basics of concurrent computing and presents some Scala preliminaries required for this book. Specifically, it does the following:
We will start by examining what concurrent programming is and why it is important.
In concurrent programming, we express a program as a set of concurrent computations that execute during overlapping time intervals and coordinate in some way. Implementing a concurrent program that functions correctly is usually much harder than implementing a sequential one. All the pitfalls present in sequential programming lurk in every concurrent program, but there are many other things that can go wrong, as we will learn in this book. A natural question arises: why bother? Can't we just keep writing sequential programs?
Concurrent programming has multiple advantages. First, increased concurrency can improve program performance. Instead of executing the entire program on a single processor, different subcomputations can be performed on separate processors, making the program run faster. With the spread of multicore processors, this is the primary reason why concurrent programming is nowadays getting so much attention.
A concurrent programming model can result in faster I/O operations. A purely sequential program must periodically poll I/O to check if there is any data input available from the keyboard, the network interface, or some other device. A concurrent program, on the other hand, can react to I/O requests immediately. For I/O-intensive operations, this results in improved throughput, and is one of the reasons why concurrent programming support existed in programming languages even before the appearance of multiprocessors. Thus, concurrency can ensure the improved responsiveness of a program that interacts with the environment.
Finally, concurrency can simplify the implementation and maintainability of computer programs. Some programs can be represented more concisely using concurrency. It can be more convenient to divide the program into smaller, independent computations than to incorporate everything into one large program. User interfaces, web servers, and game engines are typical examples of such systems.
In this book, we adopt the convention that concurrent programs communicate through the use of shared memory, and execute on a single computer. By contrast, a computer program that executes on multiple computers, each with its own memory, is called a distributed program, and the discipline of writing such programs is called distributed programming. Typically, a distributed program must assume that each of the computers can fail at any point, and provide some safety guarantees if this happens. We will mostly focus on concurrent programs, but we will also look at examples of distributed programs.
In a computer system, concurrency can manifest itself in the computer hardware, at the operating system level, or at the programming language level. We will focus mainly on programming-language-level concurrency.
The coordination of multiple executions in a concurrent system is called synchronization, and it is a key part in successfully implementing concurrency. Synchronization includes mechanisms used to order concurrent executions in time. Furthermore, synchronization specifies how concurrent executions communicate, that is, how they exchange information. In concurrent programs, different executions interact by modifying the shared memory subsystem of the computer. This type of synchronization is called shared memory communication. In distributed programs, executions interact by exchanging messages, so this type of synchronization is called message-passing communication.
At the lowest level, concurrent executions are represented by entities called processes and threads, covered in Chapter 2, Concurrency on the JVM and the Java Memory Model. Processes and threads traditionally use entities such as locks and monitors to order parts of their execution. Establishing an order between the threads ensures that the memory modifications done by one thread are visible to a thread that executes later.
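As a minimal sketch of this idea (using only standard JVM facilities, with object and variable names chosen purely for illustration), two threads below use a synchronized block on a shared lock object to order their updates to a shared counter, and the main thread joins them before reading the result:

object SynchronizedExample extends App {
  var counter = 0
  val lock = new AnyRef
  // The synchronized block ensures that only one thread at a time increments
  // the counter, and that the updates become visible to the other threads.
  def increment(): Unit = lock.synchronized { counter += 1 }
  val threads = for (_ <- 1 to 2) yield new Thread {
    override def run(): Unit = for (_ <- 1 to 1000) increment()
  }
  threads.foreach(_.start())
  threads.foreach(_.join())
  println(counter)  // always prints 2000
}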
Often, expressing concurrent programs using threads and locks is cumbersome. More complex concurrent facilities have been developed to address this, such as communication channels, concurrent collections, barriers, countdown latches, and thread pools. These facilities are designed to more easily express specific concurrent programming patterns, and some of them are covered in Chapter 3, Traditional Building Blocks of Concurrency.
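For example, the following sketch runs two tasks on a thread pool and uses a countdown latch to wait until both have completed; the pool size, the task bodies, and the names are illustrative only:

import java.util.concurrent.{CountDownLatch, Executors}

object ThreadPoolExample extends App {
  val pool = Executors.newFixedThreadPool(2)  // a pool with two worker threads
  val latch = new CountDownLatch(2)           // completion signal for two tasks
  for (i <- 1 to 2) pool.execute(new Runnable {
    def run(): Unit = {
      println(s"task $i done")
      latch.countDown()
    }
  })
  latch.await()   // block until both tasks have signalled completion
  pool.shutdown()
}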
Traditional concurrency is relatively low-level and prone to various kinds of errors, such as deadlocks, starvation, data races, and race conditions. You will usually not use low-level concurrency primitives when writing concurrent Scala programs. Still, a basic knowledge of low-level concurrent programming will prove invaluable in understanding high-level concurrency concepts later.
Modern concurrency paradigms are more advanced than traditional approaches to concurrency. Here, the crucial difference lies in the fact that a high-level concurrency framework expresses which goal to achieve, rather than how to achieve that goal.
In practice, the difference between low-level and high-level concurrency is less clear, and different concurrency frameworks form a continuum rather than two distinct groups. Still, recent developments in concurrent programming show a bias towards declarative and functional programming styles.
As we will see in Chapter 2, Concurrency on the JVM and the Java Memory Model, computing a value concurrently requires creating a thread with a custom run method, invoking the start method, waiting until the thread completes, and then inspecting specific memory locations to read the result. Here, what we really want to say is compute some value concurrently, and inform me when you are done. Furthermore, we would prefer to use a programming model that abstracts over the coordination details of the concurrent computation, to treat the result of the computation as if we already have it, rather than having to wait for it and then reading it from the memory. Asynchronous programming using futures is a paradigm designed to specifically support these kinds of statements, as we will learn in Chapter 4, Asynchronous Programming with Futures and Promises. Similarly, reactive programming using event streams aims to declaratively express concurrent computations that produce many values, as we will see in Chapter 6, Concurrent Programming with Reactive Extensions.
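As a first taste of this style (a sketch only; futures are covered in detail in Chapter 4), the following program declares a concurrent computation and registers a callback on its result, without creating threads or reading results from shared memory locations:

import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object FutureSketch extends App {
  // Declare what to compute concurrently, and react when the result is ready.
  val sum: Future[Int] = Future { (1 to 1000).sum }
  sum.foreach(result => println(s"computed: $result"))
  Thread.sleep(1000)  // keep the main execution alive long enough for the callback
}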
The declarative programming style is increasingly common in sequential programming too. Languages such as Python, Haskell, Ruby, and Scala express operations on their collections in terms of functional operators and allow statements such as filter all negative integers from this collection. This statement expresses a goal rather than the underlying implementation, making it easy to parallelize such an operation behind the scenes. Chapter 5, Data-Parallel Collections, describes the data-parallel collections framework available in Scala, which is designed to accelerate collection operations using multicores.
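The following sketch contrasts the sequential and the data-parallel versions of such a statement; the collection and its contents are arbitrary, and the parallel collections themselves are covered in Chapter 5:

object ParallelFilter extends App {
  val numbers = (-1000000 to 1000000).toVector
  // Sequential version: filter all negative integers from this collection.
  val negativesSeq = numbers.filter(_ < 0)
  // Data-parallel version: the same declarative statement, executed on multiple cores.
  val negativesPar = numbers.par.filter(_ < 0)
  println(negativesSeq.length == negativesPar.length)  // prints true
}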
Another trend seen in high-level concurrency frameworks is specialization towards specific tasks. Software transactional memory technology is specifically designed to express memory transactions and does not deal with how to start concurrent executions at all. A memory transaction is a sequence of memory operations that appear as if they either execute all at once or do not execute at all. This is similar to the concept of database transactions. The advantage of using memory transactions is that this avoids a lot of errors typically associated with low-level concurrency. Chapter 7, Software Transactional Memory, explains software transactional memory in detail.
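As a small sketch of this idea (assuming the ScalaSTM library, described in Chapter 7, is available as a dependency, and using illustrative account names), a transfer between two transactional references appears to execute either all at once or not at all:

import scala.concurrent.stm._

object StmSketch extends App {
  // Two transactional references representing account balances.
  val accountA = Ref(100)
  val accountB = Ref(0)
  // The two writes form a single memory transaction.
  atomic { implicit txn =>
    accountA() = accountA() - 50
    accountB() = accountB() + 50
  }
  println((accountA.single(), accountB.single()))  // prints (50,50)
}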
Finally, some high-level concurrency frameworks aim to transparently provide distributed programming support as well. This is especially true for data-parallel frameworks and message-passing concurrency frameworks, such as the actors described in Chapter 8, Actors.
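As a glimpse of the message-passing style (a sketch only; it assumes the Akka library described in Chapter 8, and the actor and system names are illustrative), the actor below reacts to asynchronous messages sent to it:

import akka.actor._

class Greeter extends Actor {
  def receive = {
    case name: String => println(s"Hello, $name!")
  }
}

object ActorSketch extends App {
  val system = ActorSystem("sketch")
  val greeter = system.actorOf(Props[Greeter], "greeter")
  greeter ! "world"   // send an asynchronous message to the actor
  Thread.sleep(1000)  // give the actor time to process the message
  system.terminate()
}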
Although Scala is still a language on the rise, and has yet to receive the wide-scale adoption of a language such as Java, its support for concurrent programming is rich and powerful. Concurrency frameworks for nearly all the different styles of concurrent programming exist in the Scala ecosystem and are being actively developed. Throughout its development, Scala has pushed the boundaries when it comes to providing modern, high-level application programming interfaces, or APIs, for concurrent programming. There are many reasons for this.
The primary reason that so many modern concurrency frameworks have found their way into Scala is its inherent syntactic flexibility. Thanks to features such as first-class functions, by-name parameters, type inference, and pattern matching, explained in the following sections, it is possible to define APIs that look as if they are built-in language features.
Such APIs emulate various programming models as embedded domain-specific languages, with Scala serving as a host language: actors, software transactional memory, and futures are examples of APIs that look like they are basic language features when they are in fact implemented as libraries. On one hand, Scala avoids the need for developing a new language for each new concurrent programming model and serves as a rich nesting ground for modern concurrency frameworks. On the other hand, lifting the syntactic burden present in many other languages attracts more users.
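To see how this works, consider the following sketch, in which a hypothetical thread helper method, defined with a by-name parameter, makes a library call read like a built-in language construct:

object SyntaxSketch extends App {
  // A helper that looks like a language feature, but is just a method whose
  // by-name parameter captures the block of code passed at the call site.
  def thread(body: => Unit): Thread = {
    val t = new Thread { override def run(): Unit = body }
    t.start()
    t
  }
  // The call site reads as if thread { ... } were a built-in construct.
  val t = thread { println("Hello from a custom construct!") }
  t.join()
}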
The second reason Scala has pushed ahead lies in the fact that it is a safe language. Automatic garbage collection, automatic bounds checks, and the lack of pointer arithmetic help avoid problems such as memory leaks, buffer overflows, and other memory errors. Similarly, static type safety eliminates a lot of programming errors at an early stage. When it comes to concurrent programming, which is in itself prone to various kinds of concurrency errors, having one less thing to worry about can make a world of difference.
The third important reason is interoperability. Scala programs are compiled into Java bytecode, so the resulting executable code runs on top of the Java Virtual Machine (JVM). This means that Scala programs can seamlessly use existing Java libraries, and interact with Java's rich ecosystem. Often, transitioning to a different language is a painful process. In the case of Scala, a transition from a language such as Java can proceed gradually and is much easier. This is one of the reasons for its growing adoption, and also a reason why some Java-compatible frameworks choose Scala as their implementation language.
Importantly, the fact that Scala runs on the JVM implies that Scala programs are portable across a range of different platforms. Not only that, but the JVM has well-defined threading and memory models, which are guaranteed to work in the same way on different computers. While portability is important for the consistent semantics of sequential programs, it is even more important when it comes to concurrent computing.
Having seen some of Scala's advantages for concurrent programming, we are now ready to study the language features relevant for this book.
At the time of writing, the next planned release of the language is Scala 2.12. From the user and API perspective, Scala 2.12 does not introduce new ground-breaking features. The goal of the 2.12 release is to improve code optimization and make Scala compliant with the Java 8 runtime. Since Scala's primary target is the Java runtime, making Scala compliant with the Java 8 runtime will reduce the size of compiled programs and JAR files, and bring better performance and faster compilation. From the user perspective, the major change is that you will have to install JDK 8 instead of JDK 7.
The particular changes in Scala 2.12 worth mentioning are the following:
In this chapter, we studied what concurrent programming is and why Scala is a good language for concurrency. We gave a brief overview of what you will learn in this book, and how the book is organized. Finally, we stated some Scala preliminaries necessary for understanding the various concurrency topics in the subsequent chapters. If you would like to learn more about sequential Scala programming, we suggest that you read the book Programming in Scala, by Martin Odersky, Lex Spoon, and Bill Venners, Artima Inc.
In the next chapter, we will start with the fundamentals of concurrent programming on the JVM. We will introduce the basic concepts in concurrent programming, present the low-level concurrency utilities available on the JVM, and learn about the Java Memory Model.
"All non-trivial abstractions, to some degree, are leaky."
--Jeff Atwood

Since its inception, Scala has run primarily on top of the JVM, and this fact has driven the design of many of its concurrency libraries. The memory model in Scala, its multithreading capabilities, and its inter-thread synchronization are all inherited from the JVM. Most, if not all, higher-level Scala concurrency constructs are implemented in terms of the low-level primitives presented in this chapter. These primitives are the basic way to deal with concurrency; in a way, the APIs and synchronization primitives in this chapter constitute the assembly of concurrent programming on the JVM.
In most cases, you should avoid low-level concurrency in favor of the higher-level constructs introduced later, but we felt it was important for you to understand what a thread is, that a guarded block is better than busy-waiting, or why a memory model is useful. We are convinced that this is essential for a better understanding of high-level concurrency abstractions. Despite the popular notion that an abstraction that requires knowledge about its implementation is broken, understanding the basics often proves very handy; in practice, all abstractions are to some extent leaky.
