Understand the philosophy of the Clojure language and dive into its inner workings to unlock its advanced features, methodologies, and constructs
If you're looking to learn more about the core libraries and dive deep into the Clojure language, then this book is ideal for you. Prior knowledge of the Clojure language is required.
Clojure is a general-purpose language from the Lisp family with an emphasis on functional programming. It has some interesting concepts and features such as immutability, gradual typing, thread-safe concurrency primitives, and macro-based metaprogramming, which make it a great choice to create modern, performant, and scalable applications.
Mastering Clojure gives you an insight into the nitty-gritty details and more advanced features of the Clojure programming language to create more scalable, maintainable, and elegant applications. You'll start off by learning the details of sequences, concurrency primitives, and macros. With plenty of examples, the book walks you through orchestrating concurrency and parallelism, which will help you understand Clojure reducers, and through composing transducers, so that you know functional composition and process transformation inside out. We also explain how reducers and transducers can be used to handle data in a more performant manner.
Later on, we describe how Clojure also supports other programming paradigms such as pure functional programming and logic programming. Furthermore, you'll level up your skills by taking advantage of Clojure's powerful macro system. Parallel, asynchronous, and reactive programming techniques are also described in detail.
Lastly, we'll show you how to test and troubleshoot your code to speed up your development cycles and allow you to deploy the code faster.
This is an easy-to-follow project-based guide that throws you directly into the excitement of Clojure code. Mastering Clojure is for anyone who is interested in expanding their knowledge of language features and advanced functional programming.
Number of pages: 389
Year of publication: 2016
Copyright © 2016 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: March 2016
Production reference: 1180316
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78588-974-5
www.packtpub.com
Author
Akhil Wali
Reviewer
Matt Revelle
Commissioning Editor
Neil Alexander
Acquisition Editor
Aaron Lazar
Content Development Editor
Aishwarya Pandere
Technical Editor
Tanmayee Patil
Copy Editor
Merilyn Pereira
Project Coordinator
Nidhi Joshi
Proofreader
Safis Editing
Indexer
Rekha Nair
Graphics
Jason Monteiro
Production Coordinator
Melwyn Dsa
Cover Work
Melwyn D'sa
Akhil Wali is a software developer. He has been writing code as a hobbyist since 1997 and professionally since 2010. He completed his post graduation from Santa Clara University in 2010, and he graduated from Visvesvaraya Technological University in 2008. His areas of work include business intelligence systems, ERP systems, search engines, and document collaboration tools. He mostly works with Clojure, JavaScript, and C#. Apart from computers, his interests include soccer, guitar solos, and finding out more about the universe.
I would like to thank two important women in my life for supporting me and inspiring me to write this book—my mother Renuka and my wife Megha. I also thank Matt Revelle and the Clojure community for their fantastic input and ideas.
Matt Revelle is a doctoral candidate in computer science at George Mason University, where he works on machine learning and social network dynamics. He started using Clojure in 2008 and it continues to be his preferred language for many projects and his daily work.
I would like to thank Akhil Wali and Nidhi Joshi for helping me to understand the purpose of this book and encouraging me to provide feedback in a timely fashion.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www2.packtpub.com/books/subscription/packtlib
Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.
Ever since the dawn of computers decades ago, there have been a number of programming languages created for the purpose of writing software. One of the earliest of these languages is Lisp, whose name is an abbreviation of the term "list processing". Lisp has evolved greatly over time, and there are now several dialects of Lisp. Each of these dialects emphasizes its own set of ideas and features. Clojure is one among these Lisps, and it focuses on immutability, concurrency, and parallelism. It emphasizes being simple, practical, and intuitive, which makes it easy to learn. It is said that you never realize how powerful a language can be until you have programmed in a Lisp, and Clojure is no exception to this rule. A skilled Clojure programmer can easily and quickly create software that is both performant and scalable.
With the recent rise of parallel data processing and multicore architectures, functional programming languages have become more popular for creating software that is both provable and performant. Clojure brings functional programming to the Java Virtual Machine (JVM), and also to web browsers through ClojureScript. Like other functional programming languages, Clojure focuses on the use of functions and immutable data structures for writing programs. Clojure also adds a hint of Lisp through the use of symbolic expressions and a dynamic type system.
This book will walk you through the interesting features of the Clojure language. We will also discuss some of the more advanced and lesser known programming constructs in Clojure. Several libraries from the Clojure ecosystem that we can put to practical use in our own programs will also be described. You won't need to be convinced any more about the elegance and power of the Clojure language by the time you've finished this book.
This book wouldn't have materialized without the feedback from the technical reviewers and the effort of the content and editing teams at Packt Publishing.
Chapter 1, Working with Sequences and Patterns, describes several elementary programming techniques, such as recursion, sequences, and pattern matching.
Chapter 2, Orchestrating Concurrency and Parallelism, explains the various constructs available in the Clojure language for concurrent and parallel programming.
Chapter 3, Parallelization Using Reducers, introduces reducers, which are abstractions of collection types for parallel data processing.
Chapter 4, Metaprogramming with Macros, explains how we can use macros and quoting to implement our own programming constructs in Clojure.
Chapter 5, Composing Transducers, describes how we can define and compose data transformations using transducers.
Chapter 6, Exploring Category Theory, explores algebraic data structures, such as functors, monoids, and monads, from the pure functional programming world.
Chapter 7, Programming with Logic, describes how we can use logical relations to solve problems.
Chapter 8, Leveraging Asynchronous Tasks, explains how we can write code that is executed asynchronously.
Chapter 9, Reactive Programming, describes how we can implement solutions to problems using asynchronous event streams.
Chapter 10, Testing Your Code, covers several testing libraries that are useful in verifying our code. This chapter describes techniques such as test-driven development, behavior-driven development, and generative testing.
Chapter 11, Troubleshooting and Best Practices, describes techniques to debug your code as well as several good practices for developing Clojure applications and libraries.
One of the pieces of software required for this book is the Java Development Kit (7 or above), which you can obtain from http://www.oracle.com/technetwork/java/javase/downloads/. JDK is necessary to run and develop applications on the Java platform. The other major software that you'll need is Leiningen (2.5.1 or above), which you can download and install from http://github.com/technomancy/leiningen.
Leiningen is a tool for managing Clojure projects and their dependencies. Throughout this book, we'll use a number of Clojure libraries. Leiningen will download these libraries, and also the Clojure language itself, for us, as required.
You'll also need a text editor or an integrated development environment (IDE). If you already have a text editor that you prefer, you can probably use it. Navigate to http://dev.clojure.org/display/doc/Getting+Started for a list of environment-specific plugins to write code in Clojure. If you don't have a preference, it is suggested that you use Eclipse with Counterclockwise (http://doc.ccw-ide.org/) or Light Table (http://lighttable.com/).
Some examples in this book will also require a web browser, such as Chrome (42 or above), Firefox (38 or above), or Microsoft Internet Explorer (9 or above).
This book is for programmers or software architects who are familiar with Clojure and want to learn about the language's features in detail. It is also for readers who are eager to explore popular and practical Clojure libraries.
This book does not describe the syntax of the Clojure language. You are expected to be familiar with the language, but you need not be a Clojure expert. You are also expected to know how functions are used and defined in Clojure, and have some basic knowledge about Clojure data structures such as strings, lists, vectors, maps, and sets. You must also be able to compile ClojureScript programs to JavaScript and run them in an HTML page.
In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "Hence, for trees that are represented as sequences, we should use the seq-zip function instead."
A block of code is set as follows:
When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:
Any command-line input or output is written as follows:
Another simple convention that we use is to always show the Clojure code that's entered in the REPL (read-evaluate-print-loop) starting with the user> prompt. In practice, this prompt will change depending on the Clojure namespace that we are currently using. However, for simplicity, code in the REPL always starts with the user> prompt in this book, as follows:
For convenience, the REPL output in this book is pretty-printed (using the clojure.pprint/pprint function). Objects that are printed in the REPL output are enclosed within the #< and > symbols. We must note that the output of the time form in your own REPL may not completely match the output shown in the code examples of this book. Rather, the use of time forms is meant to give you an idea of the scale of the time taken to execute a given expression. Similarly, the output of the code examples that use the rand-int function may not exactly match the output in your REPL.
Some examples in this book use ClojureScript, and the files for these examples will have a .cljs extension. Also, all macros used in these examples will have to be explicitly included using the :require-macros clause of the ns form. The HTML and CSS files associated with the ClojureScript examples in this book will not be shown in this book, but can always be found in the book's code bundle.
Warnings or important notes appear in a box like this.
Tips and tricks appear like this.
Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.
To send us general feedback, simply e-mail <[email protected]>, and mention the book's title in the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
You can download the code files by following these steps:
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.
To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.
Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.
Please contact us at <[email protected]> with a link to the suspected pirated material.
We appreciate your help in protecting our authors and our ability to bring you valuable content.
If you have a problem with any aspect of this book, you can contact us at <[email protected]>, and we will do our best to address the problem.
A sequence, shortened as a seq, is essentially an abstraction of a list. This abstraction provides a unified model or interface to interact with a collection of items. In Clojure, all the primitive data structures, namely strings, lists, vectors, maps, and sets can be treated as sequences. In practice, almost everything that involves iteration can be translated into a sequence of computations. A collection is termed as seqable if it implements the abstraction of a sequence. We will learn everything there is to know about sequences in this section.
Sequences can also be lazy. A lazy sequence can be thought of as a possibly infinite series of computed values. The computation of each value is deferred until it is actually needed. We should note that the computation of a recursive function can easily be represented as a lazy sequence. For example, the Fibonacci sequence can be computed by lazily adding the last two elements in the previously computed sequence. This can be implemented as shown in Example 1.7.
The following examples can be found in src/m_clj/c1/seq.clj of the book's source code.
Example 1.7: A lazy Fibonacci sequence
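The listing for this example is not included in this text. As a minimal sketch, reconstructed only from the description of fibo-lazy given in this section (the code in the book's source bundle may differ), the function could be written like this:

```clojure
;; Reconstructed sketch of Example 1.7; not copied from the book's code bundle.
(defn fibo-lazy [n]
  (->> [0N 1N]
       (iterate (fn [[a b]] [b (+ a b)]))  ; lazily step from [a b] to [b (+ a b)]
       (map first)                         ; keep the first number of each pair
       (take n)))                          ; return only the first n numbers

(fibo-lazy 10)
;; => (0N 1N 1N 2N 3N 5N 8N 13N 21N 34N)
```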
The threading macro ->> is used to pass the result of a given expression as the last argument to the next expression, in a repetitive manner for all expressions in its body. Similarly, the threading macro -> is used to pass the result of a given expression as the first argument to the subsequent expressions.
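As a small illustration of the difference between the two macros, consider the following sketch. The numbers are arbitrary and were chosen only so that the argument position changes the result:

```clojure
;; -> inserts the previous result as the FIRST argument of each form.
(-> 10 (- 3) (/ 7))    ; expands to (/ (- 10 3) 7)
;; => 1

;; ->> inserts the previous result as the LAST argument of each form.
(->> 10 (- 3) (/ 7))   ; expands to (/ 7 (- 3 10))
;; => -1
```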
The fibo-lazy function from Example 1.7 uses the iterate, map, and take functions to create a lazy sequence. We will study these functions in more detail later in this section. The fibo-lazy function takes a single argument n, which indicates the number of items to be returned by the function. In the fibo-lazy function, the values 0N and 1N are passed as a vector to the iterate function, which produces a lazy sequence. The function used for this iteration creates a new pair of values b and (+ a b) from the initial values a and b.
Next, the map function applies the first function to obtain the first element in each resulting vector. A take form is finally applied to the sequence returned by the map function to retrieve the first n values in the sequence. The fibo-lazy function does not cause any error even when passed relatively large values of n, shown as follows:
Interestingly, the fibo-lazy function in Example 1.7 performs significantly better than the recursive functions from Example 1.2 and Example 1.3, as shown here:
Also, binding the value returned by the fibo-lazy function to a variable does not really consume any time. This is because this returned value is lazy and not evaluated yet. Also, the type of the return value is clojure.lang.LazySeq, as shown here:
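The REPL session for this step is not reproduced here. The same behavior can be observed with any lazy sequence; the following sketch assumes a hypothetical var named squares rather than the fibo-lazy function, so that it can be run on its own:

```clojure
;; Binding an infinite lazy sequence returns immediately,
;; because none of its elements have been computed yet.
(def squares (map #(* % %) (range)))

(type squares)
;; => clojure.lang.LazySeq

;; Elements are computed only when something consumes them.
(take 5 squares)
;; => (0 1 4 9 16)
```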
We can optimize the fibo-lazy function even further by using memoization, which essentially caches the value returned by a function for a given set of inputs. This can be done using the memoize function, as follows:
The fibo-mem function is a memoized version of the fibo-lazy function. Hence, subsequent calls to the fibo-mem function for the same set of inputs will return values significantly faster, shown as follows:
Note that the memoize function can be applied to any function, and it is not really related to sequences. The function we pass to memoize must be free of side effects, or else any side effects will be invoked only the first time the memoized function is called with a given set of inputs.
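To see the caching and the side-effect caveat in isolation, here is a minimal sketch. The slow-square and fast-square functions and the call-count atom are hypothetical names introduced only for this illustration; they are not part of the book's examples:

```clojure
;; call-count is a hypothetical counter that makes the caching visible.
(def call-count (atom 0))

(defn slow-square [x]
  (swap! call-count inc)   ; side effect: runs only on a cache miss
  (Thread/sleep 100)       ; simulate an expensive computation
  (* x x))

(def fast-square (memoize slow-square))

(fast-square 12)   ; computed by slow-square, takes about 100 ms
(fast-square 12)   ; same argument, so the cached value is returned
@call-count
;; => 1
```

The counter stays at 1 after the second call, which shows that the side effect inside slow-square was not repeated.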
Sequences are a truly ubiquitous abstraction in Clojure. The primary motivation behind using sequences is that any domain with sequence-like data in it can be easily modelled using the standard functions that operate on sequences. This well-known quote from the Lisp world reflects this design:
"It is better to have 100 functions operate on one data abstraction than 10 functions on 10 data structures."
A sequence can be constructed using the cons function. We must provide an element and another sequence as arguments to the cons function. The first function is used to access the first element in a sequence, and similarly the rest function is used to obtain the other elements in the sequence, shown as follows:
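The original REPL listing is missing from this text. A minimal sketch of such a session, with the results written as comments and xs used as an arbitrary name, might look like this:

```clojure
(def xs (cons 0 '(1 2 3)))   ; prepend 0 to an existing list

xs
;; => (0 1 2 3)

(first xs)
;; => 0

(rest xs)
;; => (1 2 3)
```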
The first and rest functions in Clojure are equivalent to the car and cdr functions, respectively, from traditional Lisps. The cons function retains its traditional name.
In Clojure, an empty list is represented by the literal (). An empty list is considered a truthy value, and does not equate to nil. This rule is true for any empty collection. An empty list does indeed have a type – it's a list. On the other hand, the nil literal signifies the absence of a value, of any type, and is not a truthy value. The second argument that is passed to cons could be empty, in which case the resulting sequence would contain a single element:
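As a hedged illustration of these rules (the values are arbitrary), the following forms can be evaluated in a REPL:

```clojure
(cons 0 ())              ; cons onto an empty list
;; => (0)

(cons 0 nil)             ; nil is accepted as an empty sequence too
;; => (0)

(if () :truthy :falsy)   ; an empty list tests positive in a conditional
;; => :truthy

(= () nil)               ; an empty list is not equal to nil
;; => false
```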
An interesting quirk is that nil can be treated as an empty collection, but the converse is not true. We can use the empty? and nil? functions to test for an empty collection and a nil value, respectively. Note that (empty? nil) returns true, shown as follows:
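The session that belongs here is not part of this text; a small sketch of the four combinations is:

```clojure
(empty? ())    ;; => true
(empty? nil)   ;; => true   nil is treated as an empty collection
(nil? ())      ;; => false  an empty list is still a value
(nil? nil)     ;; => true
```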
By a truthy value, we mean a value that tests positive in a conditional expression such as an if or a when form.
The rest function will return an empty list when supplied an empty list. Thus, the value returned by rest is always truthy. The seq function can be used to obtain a sequence from a given collection. It will return nil for an empty list or collection. Hence, the first, rest, and seq functions can be used to iterate over a sequence. The next function can also be used for iteration, and the expression (seq (rest coll)) is equivalent to (next coll), shown as follows:
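The following sketch shows these return values and one way they can drive an iteration. The sum-all function is a hypothetical helper written for this illustration only:

```clojure
(rest [1])         ;; => ()    always a (possibly empty) sequence
(next [1])         ;; => nil   nil once nothing is left
(seq (rest [1]))   ;; => nil   same result as (next [1])
(seq [])           ;; => nil

;; Because seq and next return nil at the end, they work as loop conditions.
(defn sum-all [coll]
  (loop [s (seq coll), acc 0]
    (if s
      (recur (next s) (+ acc (first s)))
      acc)))

(sum-all [1 2 3])
;; => 6
```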
The sequence function can be used to create a list from a sequence. For example, nil can be converted into an empty list using the expression (sequence nil). In Clojure, the seq? function is used to check whether a value implements the sequence interface, namely clojure.lang.ISeq. Among the literal collection types, only lists implement this interface; other data structures such as vectors, sets, and maps have to be converted into a sequence by using the seq function. Hence, of these collections, seq? will return true only for lists, although it also returns true for values that are already sequences, such as a lazy sequence or the result of calling seq on a non-empty vector. Note that the list?, vector?, map?, and set? functions can be used to check the concrete type of a given collection. The behavior of the seq? function with lists and vectors can be described as follows:
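The original listing is not available in this text. A short sketch of the behavior just described:

```clojure
(seq? '(1 2 3))        ;; => true   a list implements clojure.lang.ISeq
(seq? [1 2 3])         ;; => false  a vector does not
(seq? (seq [1 2 3]))   ;; => true   after conversion with seq it does

(sequence nil)
;; => ()
```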
Only lists and vectors provide a guarantee of sequential ordering among elements. In other words, lists and vectors will store their elements in the same order or sequence as they were created. This is in contrast to maps and sets, which can reorder their elements as needed. We can use the sequential? function to check whether a collection provides sequential ordering:
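A sketch of how sequential? responds to the four literal collection types (the sample values are arbitrary):

```clojure
(sequential? '(1 2 3))   ;; => true
(sequential? [1 2 3])    ;; => true
(sequential? {:a 1})     ;; => false
(sequential? #{:a :b})   ;; => false
```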
The associative? function can be used to determine whether a collection or sequence associates a key with a particular value. Note that this function returns true only for maps and vectors:
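The same four sample collections can be passed to associative?, as in this sketch:

```clojure
(associative? '(1 2 3))   ;; => false
(associative? [1 2 3])    ;; => true   keys are the indices 0, 1, 2
(associative? {:a 1})     ;; => true
(associative? #{:a :b})   ;; => false
```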
The behavior of the associative? function is fairly obvious for a map since a map is essentially a collection of key-value pairs. The fact that a vector is also associative is well justified too, as a vector has an implicit key for a given element, namely the index of the element in the vector. For example, the [:a :b] vector has two implicit keys, 0 and 1, for the elements :a and :b respectively. This brings us to an interesting consequence – vectors and maps can be treated as functions that take a single argument, that is a key, and return an associated value, shown as follows:
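The listing is missing here; a minimal sketch of calling a vector and a map as functions is shown below. The map contents are arbitrary sample data:

```clojure
([:a :b] 1)         ; look up index 1 in the vector
;; => :b

({:a 1 :b 2} :b)    ; look up key :b in the map
;; => 2

({:a 1 :b 2} :c)    ; a missing key in a map yields nil
;; => nil
```

Note that a vector, unlike a map, throws an exception when called with an index that is out of bounds.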
Although they are not associative by nature, sets are also functions. Sets return a value contained in them, or nil, depending on the argument passed to them, shown as follows:
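A sketch of calling a set as a function, using an arbitrary set of numbers:

```clojure
(#{1 2 3} 2)   ; 2 is a member, so it is returned
;; => 2

(#{1 2 3} 5)   ; 5 is not a member
;; => nil
```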
Now that we have familiarized ourselves with the basics of sequences, let's have a look at the many functions that operate over sequences.
There are several ways to create sequences other than using the cons function. We have already encountered the conj function in the earlier examples of this chapter. The conj function takes a collection as its first argument, followed by any number of arguments to add to the collection. We must note that conj behaves differently for lists and vectors. When supplied a list, the conj function adds the other arguments at the head, or start, of the list. In case of a vector, the conj function will insert the other arguments at the tail, or end, of the vector:
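The difference is easiest to see side by side. The following sketch uses arbitrary sample values:

```clojure
(conj '(1 2 3) 4 5)   ; each item is added at the head of the list in turn
;; => (5 4 1 2 3)

(conj [1 2 3] 4 5)    ; each item is appended at the tail of the vector
;; => [1 2 3 4 5]
```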
The concat function can be used to join or concatenate any number of sequences in the order in which they are supplied, shown as follows:
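A minimal sketch, mixing vectors and a list as arbitrary inputs:

```clojure
(concat [1 2] '(3 4) [5])
;; => (1 2 3 4 5)
```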
A given sequence can be reversed using the reverse function, shown as follows:
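For instance, with an arbitrary vector as input:

```clojure
(reverse [1 2 3 4])
;; => (4 3 2 1)
```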
The range function can be used to generate a sequence of values within a given integer range. The most general form of the range function takes three arguments—the first argument is the start of the range, the second argument is the end of the range, and the third argument is the step of the range. The step of the range defaults to 1, and the start of the range defaults to 0, as shown here:
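The three arities described above can be sketched as follows; the bounds are arbitrary:

```clojure
(range 5)        ; end only: start defaults to 0, step to 1
;; => (0 1 2 3 4)

(range 2 6)      ; start and end (the end is exclusive)
;; => (2 3 4 5)

(range 0 10 3)   ; start, end, and step
;; => (0 3 6 9)
```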
We must note that the range function expects the start of the range to be less than the end of the range. If the start of the range is greater than the end of the range and the step of the range is positive, the range function will return an empty list. For example, (range 15 10) will return (). Also, the range function can be called with no arguments, in which case it returns a lazy and infinite sequence starting at 0.
The take and drop functions can be used to take or drop elements in a sequence. Both functions take two arguments, representing the number of elements to take or drop from a sequence, and the sequence itself, as follows:
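A short sketch using (range 10) as sample input:

```clojure
(take 3 (range 10))
;; => (0 1 2)

(drop 3 (range 10))
;; => (3 4 5 6 7 8 9)
```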
To obtain an item at a particular position in the sequence, we should use the nth function. This function takes a sequence as its first argument, followed by the position of the item to be retrieved from the sequence as the second argument:
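Positions are counted from zero. The sketch below also shows the optional third argument of nth, a default value for out-of-range positions, which goes slightly beyond what the text above describes:

```clojure
(nth [:a :b :c] 1)
;; => :b

(nth '(10 20 30) 2)
;; => 30

(nth [1 2] 5 :none)   ; third argument: value returned when out of range
;; => :none
```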
To repeat a given value, we can use the repeat function. This function takes two arguments and repeats the second argument the number of times indicated by the first argument:
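For example, with an arbitrary keyword as the value to repeat:

```clojure
(repeat 3 :x)
;; => (:x :x :x)
```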
The repeat function will evaluate the expression of the second argument and repeat it. To call a function a number of times, we can use the repeatedly function, as follows:
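The listing is not included in this text. A sketch of the comparison it most likely made is given below; no concrete results are shown because they are random:

```clojure
;; (rand-int 100) is evaluated once, and that single value is repeated.
(repeat 5 (rand-int 100))

;; The anonymous function is called five times, once per element.
(repeatedly 5 #(rand-int 100))
```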
In this example, the repeat form first evaluates the (rand-int 100) form, before repeating it. Hence, a single value will be repeated several times. Note that the rand-int function simply returns a random integer between 0 (inclusive) and the supplied value (exclusive). On the other hand, the repeatedly function invokes the supplied function a number of times, thus producing a new value every time the rand-int function is called.
A sequence can be repeated an infinite number of times using the cycle function. As you might have guessed, this function returns a lazy sequence to indicate an infinite series of values. The take function can be used to obtain a limited number of values from the resulting infinite sequence, shown as follows:
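A minimal sketch with an arbitrary two-element vector:

```clojure
(take 5 (cycle [0 1]))   ; take makes the infinite sequence safe to print
;; => (0 1 0 1 0)
```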
The interleave function can be used to combine any number of sequences. This function returns a sequence of the first item in each collection, followed by the second item, and so on. This combination of the supplied sequences is repeated until the shortest sequence is exhausted of values. Hence, we can easily combine a finite sequence with an infinite one to produce another finite sequence using the interleave function:
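In the following sketch, the finite vector determines where the result ends, even though (range) is infinite:

```clojure
(interleave [:a :b :c] (range))
;; => (:a 0 :b 1 :c 2)
```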
Another function that performs a similar operation is the interpose function. The interpose function inserts a given element between the adjacent elements of a given sequence:
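For example, using 0 as an arbitrary separator:

```clojure
(interpose 0 [1 2 3])
;; => (1 0 2 0 3)
```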
The iterate function can also be used to create an infinite sequence. Note that we have already used the iterate function to create a lazy sequence in Example 1.7. This function takes a function f and an initial value x as its arguments. The value returned by the iterate function will have x as the first element, (f x) as the second element, (f (f x)) as the third element, and so on. We can use the iterate function with any other function that takes a single argument, as follows:
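A small sketch with two arbitrary single-argument functions. Note that the initial value itself is the first element of the result:

```clojure
(take 5 (iterate inc 5))
;; => (5 6 7 8 9)

(take 5 (iterate #(* 2 %) 1))
;; => (1 2 4 8 16)
```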
There are also several functions to convert sequences into different representations or values. One of the most versatile of such functions is the map function. This function maps a given function over a given sequence, that is, it applies the function to each element in the sequence. Also, the value returned by map is implicitly lazy. The function to be applied to each element must be the first argument to map, and the sequence on which the function must be applied is the next argument:
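For example, incrementing every element of an arbitrary vector:

```clojure
(map inc [0 1 2 3])
;; => (1 2 3 4)
```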
Note that map can accept any number of collections or sequences as its arguments. In this case, the resulting sequence is obtained by passing the first items of the sequences as arguments to the given function, and then passing the second items of the sequences to the given function, and so on until any of the supplied sequences are exhausted. For example, we can sum the corresponding elements of two sequences using the map and + functions, as shown here:
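In this sketch, the second vector is shorter, so the result has only three elements:

```clojure
(map + [0 1 2 3] [4 5 6])
;; => (4 6 8)
```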
The mapv function has the same semantics as map, but returns a vector instead of a sequence, as shown here:
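A minimal sketch, using the same arbitrary input as a typical map call:

```clojure
(mapv inc [0 1 2 3])
;; => [1 2 3 4]
```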
Another variant of the map function is the map-indexed function. This function expects that the supplied function will accept two arguments—one for the index of a given element and another for the actual element in the list:
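The original listing is missing. A sketch consistent with the surrounding description, using "Hello" as an assumed sample string, is:

```clojure
(map-indexed (fn [i x] [i x]) "Hello")
;; => ([0 \H] [1 \e] [2 \l] [3 \l] [4 \o])
```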
In this example, the function supplied to map-indexed simply returns its arguments as a vector. An interesting point that we can observe from the preceding example is that a string can be treated as a sequence of characters.
The mapcat function is a combination of the map and concat functions. This function maps a given function over a sequence, and then concatenates the resulting sequences:
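The listing is not reproduced in this text. The following sketch compares map and mapcat; the input strings and the s alias are assumptions made for this illustration:

```clojure
(require '[clojure.string :as s])

;; map keeps one vector per input string.
(map #(s/split % #"\d") ["aa1bb" "cc2dd" "ee3ff"])
;; => (["aa" "bb"] ["cc" "dd"] ["ee" "ff"])

;; mapcat concatenates those vectors into a single sequence.
(mapcat #(s/split % #"\d") ["aa1bb" "cc2dd" "ee3ff"])
;; => ("aa" "bb" "cc" "dd" "ee" "ff")
```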
In this example, we use the split function from the clojure.string namespace to split a string using a regular expression, shown as #"\d". The split function will return a vector of strings, and hence the mapcat function returns a sequence of strings instead of a sequence of vectors like the map function.
The reduce function is used to combine or reduce a sequence of items into a single value. The reduce function requires a function as its first argument and a sequence as its second argument. The function supplied to reduce must accept two arguments. The supplied function is first applied to the first two elements in the given sequence, and then applied to the previous result and the third element in the sequence, and so on until the sequence is exhausted. The reduce function also has a second arity, which accepts an initial value, and in this case, the supplied function is applied to the initial value and the first element in the sequence as the first step. The reduce function can be considered equivalent to loop-based iteration in imperative programming languages. For example, we can compute the sum of all elements in a sequence using reduce, as follows:
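The listing is missing here. A sketch covering both arities, with arbitrary numbers, might be:

```clojure
(reduce + [1 2 3 4 5])   ; ((((1 + 2) + 3) + 4) + 5)
;; => 15

(reduce + [])            ; empty collection, no initial value
;; => 0

(reduce + 1 [])          ; empty collection, initial value 1
;; => 1

(reduce + 10 [1 2 3])    ; initial value 10, then 1, 2, and 3
;; => 16
```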
In this example, when the reduce function is supplied an empty collection and no initial value, it calls the supplied function with no arguments and returns 0, since (+) evaluates to 0. When an initial value of 1 is supplied along with an empty collection, the reduce function returns that initial value, 1, without calling the supplied function.
A list comprehension can be created using the for macro. Note that a for form evaluates to a lazy sequence, much like a call to the map function. The for macro needs to be supplied a vector of bindings to any number of collections, and an expression in the body. This macro binds the supplied symbol to each element in its corresponding collection and evaluates the body for each element. Note that the for macro also supports a :let clause to bind a value to a local symbol, and also a :when clause to filter out values:
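As a hedged illustration, the following sketch squares the numbers from 3 to 6 and then uses :let and :when to keep only the even squares. The local name sq is arbitrary:

```clojure
(for [x (range 3 7)]
  (* x x))
;; => (9 16 25 36)

(for [x (range 3 7)
      :let [sq (* x x)]    ; bind an intermediate value
      :when (even? sq)]    ; skip elements that fail the test
  sq)
;; => (16 36)
```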
The for macro can also be used over a number of collections, as shown here:
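With two bindings, the rightmost collection varies fastest, as this sketch with arbitrary inputs shows:

```clojure
(for [x (range 2)
      y [:a :b]]
  [x y])
;; => ([0 :a] [0 :b] [1 :a] [1 :b])
```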
The doseq macro has semantics similar to that of for, except for the fact that it always returns a nil value. This macro simply evaluates the body expression for all of the items in the given bindings. This is useful in forcing evaluation of an expression with side effects for all the items in a given collection:
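The listing is not included in this text. Based on the description that follows, it probably resembled this sketch:

```clojure
;; No side effect: the squares are computed and discarded.
(doseq [x (range 3 7)]
  (* x x))
;; => nil

;; With a side effect: prints 9, 16, 25, and 36 on separate lines.
(doseq [x (range 3 7)]
  (println (* x x)))
;; => nil
```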
As shown in the preceding example, both the first and second doseq forms return nil. However, the second form prints the value of the expression (* x x), which is a side effect, for all items in the sequence (range 3 7).
The into function can be used to easily convert between types of collections. This function requires two collections to be supplied to it as arguments, and returns the first collection filled with all the items in the second collection. For example, we can convert a sequence of vectors into a map, and vice versa, using the into function, shown here:
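A minimal sketch of both conversions, with arbitrary keys and values:

```clojure
(into {} [[:a 1] [:b 2]])   ; each two-element vector becomes a map entry
;; => {:a 1, :b 2}

(into [] {:a 1 :b 2})       ; each map entry becomes a two-element vector
;; => [[:a 1] [:b 2]]
```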
We should note that the into function is essentially a composition of the reduce and conj functions. As conj is used to fill the first collection, the value returned by the into
