Haskell High Performance Programming - Samuli Thomasson - E-Book

Haskell High Performance Programming E-Book

Samuli Thomasson

0,0
41,99 €

-100%
Sammeln Sie Punkte in unserem Gutscheinprogramm und kaufen Sie E-Books und Hörbücher mit bis zu 100% Rabatt.

Mehr erfahren.
Beschreibung

Boost the performance of your Haskell applications using optimization, concurrency, and parallel programming

About This Book

  • Explore the benefits of lazy evaluation, compiler features, and tools and libraries designed for high performance
  • Write fast programs at extremely high levels of abstraction
  • Work through practical examples that will help you address the challenges of writing efficient code

Who This Book Is For

To get the most out of this book, you need to have a working knowledge of reading and writing basic Haskell. No knowledge of performance, optimization, or concurrency is required.

What You Will Learn

  • Program idiomatic Haskell that's also surprisingly efficient
  • Improve performance of your code with data parallelism, inlining, and strictness annotations
  • Profile your programs to identify space leaks and missed opportunities for optimization
  • Find out how to choose the most efficient data and control structures
  • Optimize the Glasgow Haskell Compiler and runtime system for specific programs
  • See how to smoothly drop to lower abstractions wherever necessary
  • Execute programming for the GPU with Accelerate
  • Implement programming to easily scale to the cloud with Cloud Haskell

In Detail

Haskell, with its power to optimize the code and its high performance, is a natural candidate for high performance programming. It is especially well suited to stacking abstractions high with a relatively low performance cost. This book addresses the challenges of writing efficient code with lazy evaluation and techniques often used to optimize the performance of Haskell programs.

We open with an in-depth look at the evaluation of Haskell expressions and discuss optimization and benchmarking. You will learn to use parallelism and we'll explore the concept of streaming. We'll demonstrate the benefits of running multithreaded and concurrent applications. Next we'll guide you through various profiling tools that will help you identify performance issues in your program. We'll end our journey by looking at GPGPU, Cloud and Functional Reactive Programming in Haskell. At the very end there is a catalogue of robust library recommendations with code samples.

By the end of the book, you will be able to boost the performance of any app and prepare it to stand up to real-world punishment.

Style and approach

This easy-to-follow guide teaches new practices and techniques to optimize your code, and then moves towards more advanced ways to effectively write efficient Haskell code. Small and simple practical examples will help you test the concepts yourself, and you will be able to easily adapt them for any application.

Sie lesen das E-Book in den Legimi-Apps auf:

Android
iOS
von Legimi
zertifizierten E-Readern

Seitenzahl: 563

Veröffentlichungsjahr: 2016

Bewertungen
0,0
0
0
0
0
0
Mehr Informationen
Mehr Informationen
Legimi prüft nicht, ob Rezensionen von Nutzern stammen, die den betreffenden Titel tatsächlich gekauft oder gelesen/gehört haben. Wir entfernen aber gefälschte Rezensionen.



Table of Contents

Haskell High Performance Programming
Credits
About the Author
About the Reviewer
www.PacktPub.com
eBooks, discount offers, and more
Why subscribe?
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. Identifying Bottlenecks
Meeting lazy evaluation
Writing sum correctly
Weak head normal form
Folding correctly
Memoization and CAFs
Constant applicative form
Recursion and accumulators
The worker/wrapper idiom
Guarded recursion
Accumulator parameters
Inspecting time and space usage
Increasing sharing and minimizing allocation
Compiler code optimizations
Inlining and stream fusion
Polymorphism performance
Partial functions
Summary
2. Choosing the Correct Data Structures
Annotating strictness and unpacking datatype fields
Unbox with UNPACK
Using anonymous tuples
Performance of GADTs and branching
Handling numerical data
Handling binary and textual data
Representing bit arrays
Handling bytes and blobs of bytes
Working with characters and strings
Using the text library
Builders for iterative construction
Builders for strings
Handling sequential data
Using difference lists
Difference list performance
Difference list with the Writer monad
Using zippers
Accessing both ends fast with Seq
Handling tabular data
Using the vector package
Handling sparse data
Using the containers package
Using the unordered-containers package
Ephemeral data structures
Mutable references are slow
Using mutable arrays
Using mutable vectors
Bubble sort with vectors
Working with monads and monad stacks
The list monad and its transformer
Free monads
Working with monad transformers
Speedup via continuation-passing style
Summary
3. Profile and Benchmark to Your Heart's Content
Profiling time and allocations
Setting cost centres manually
Setting cost centres automatically
Installing libraries with profiling
Debugging unexpected crashes with profiler
Heap profiling
Cost centre-based heap profiling
Objects outside the heap
Retainer profiling
Biographical profiling
Benchmarking using the criterion library
Profile and monitor in real time
Monitoring over HTTP with ekg
Summary
4. The Devil's in the Detail
The anatomy of a Haskell project
Useful fields and flags in cabal files
Test suites and benchmarks
Using the stack tool
Multi-package projects
Erroring and handling exceptions
Handling synchronous errors
The exception hierarchy
Handling asynchronous errors
Throw and catch in other monads besides IO
Writing tests for Haskell
Property checks
Unit testing with HUnit
Test frameworks
Trivia at term-level
Coding in GHC PrimOps
Control inlining
Using rewrite rules
Specializing definitions
Phase control
Trivia at type-level
Phantom types
Functional dependencies
Type families and associated types
Useful GHC extensions
Monomorphism Restriction
Extensions for patterns and guards
Strict-by-default Haskell
Summary
5. Parallelize for Performance
Primitive parallelism and the Runtime System
Spark away
Subtle evaluation – pseq
When in doubt, use the force
The Eval monad and strategies
Composing strategies
Fine-tune granularity with chunking and buffering
The Par monad and schedules
spawn for futures and promises
Non-deterministic parallelism with ParIO
Diagnosing parallelism – ThreadScope
Data parallel programming – Repa
Playing with Repa in GHCi
Mapping and delayed arrays
Reduction via folding
Manifest representations
Delayed representation and fusion
Indices, slicing, and extending arrays
Convolution with stencils
Cursored and partitioned arrays
Writing fast Repa code
Additional libraries
Example from image processing
Loading the image from file
Identifying letters with convolution
Extracting strings from an image
Testing and evaluating performance
Summary
6. I/O and Streaming
Reading, writing, and handling resources
Traps of lazy I/O
File handles, buffering, and encoding
Binary I/O
Textual I/O
I/O performance with filesystem objects
Sockets and networking
Acting as a TCP/IP client
Acting as a TCP server (Unix domain sockets)
Raw UDP traffic
Networking above the transport layer
Managing resources with ResourceT
Streaming with side-effects
Choosing a streaming library
Simple streaming using io-streams
Creating input streams
Using combinators and output streams
Handling exceptions and resources in streams
An example of parsing using io-streams and attoparsec
Streaming using pipes
Composing and executing pipes
For loops and category theory in pipes
Handling exceptions in pipes
Strengths and weaknesses of pipes
Streaming using conduits
Handling resources and exceptions in conduits
Resuming conduits
Logging in Haskell
Logging with FastLogger
More abstract loggers
Timed log messages
Monadic logging
Customizing monadic loggers
Summary
7. Concurrency and Performance
Threads and concurrency primitives
Threads and mutable references
Avoid accumulating thunks
Atomic operations with IORefs
MVar
MVars are fair
MVar as a building block
Broadcasting with Chan
Software Transactional Memory
STM example – Bank accounts
Alternative transactions
Exceptions in STM
Runtime System and threads
Masking asynchronous exceptions
Asynchronous processing
Using the Async API
Async example – Timeouts
Composing with Concurrently
Lifting up from I/O
Top-level mutable references
Lifting from a base monad
Lifting base with exception handling
Summary
8. Tweaking the Compiler and Runtime System (GHC)
Using GHC like a pro
Operating GHC
Circular dependencies
Adjusting optimizations and transformations
The state hack
Floating lets in and out
Eliminating common subexpressions
Liberate-case duplicates code
Compiling via the LLVM route
Linking and building shared libraries
Preprocessing Haskell source code
Enforcing type-safety using Safe Haskell
Tuning GHC's Runtime System
Scheduler and green threads
Sparks and spark pool
Bounded threads and affinity
Indefinite blocking and weak references
Heap, stack, and memory management
Evaluation stack in Haskell
Tuning the garbage collector
Parallel GC
Profiling and tracing options
Tracing using eventlog
Options for profiling and debugging
Summary of useful GHC options
Basic usage
The LLVM backend
Turn optimizations on and off
Configuring the Runtime System (compile-time)
Safe Haskell
Summary of useful RTS options
Scheduler flags
Memory management
Garbage collection
Runtime System statistics
Profiling and debugging
Summary
9. GHC Internals and Code Generation
Interpreting GHC's internal representations
Reading GHC Core
Spineless tagless G-machine
Primitive GHC-specific features
Kinds encode type representation
Datatype generic programming
Working example – A generic sum
Generating Haskell with Haskell
Splicing with $(…)
Names in templates
Smart template constructors
The constN function
Lifting Haskell code to Q with quotation brackets
Launching missiles during compilation
Reifying Haskell data into template objects
Deriving setters with Template Haskell
Quasi-quoting for DSLs
Summary
10. Foreign Function Interface
From Haskell to C and C to Haskell
Common types in Haskell and C
Importing static functions and addresses
Exporting Haskell functions
Compiling a shared library
Function pointers and wrappers
Haskell callbacks from C
Data marshal and stable pointers
Allocating memory outside the heap
Pointing to objects in the heap
Marshalling abstract datatypes
Marshalling in standard libraries
Summary
11. Programming for the GPU with Accelerate
Writing Accelerate programs
Kernels – The motivation behind explicit use and run
Working with elements and scalars
Rudimentary array computations
Example – Matrix multiplication
Flow control and conditional execution
Inspecting generated code
Running with the CUDA backend
Debugging CUDA programs
More Accelerate concepts
Working with tuples
Folding, reducing, and segmenting
Accelerated stencils
Permutations in Accelerate
Using the backend foreign function interface
Summary
12. Scaling to the Cloud with Cloud Haskell
Processes and message-passing
Creating a message type
Creating a Process
Spawning and closures
Running with the SimpleLocalNet backend
Using channels
Establishing bidirectional channels
Calling a remote process
Handling failure
Firing up monitors
Matching on the message queue
Linking processes together
Message-passing performance
Nodes and networking
Summary
13. Functional Reactive Programming
The tiny discrete-time Elerea
Mutually recursive signals
Signalling side-effects
Dynamically changing signal networks
Performance and limitations in Elerea
Events and signal functions with Yampa
Adding state to signal functions
Working with time
Switching and discrete-time events
Integrating to the real world
Reactive-banana – Safe and simple semantics
Example – First GUI application
Graphical display with wxWidgets
Combining events and behaviors
Switching events and behaviors
Observing moments on demand
Recursion and semantics
Adding input and output
Input via polling or handlers
Reactimate output
Input and output dynamically
Summary
14. Library Recommendations
Representing data
Functional graphs
Numeric data for special use
Encoding and serialization
Binary serialization of Haskell values
Encoding to and from other formats
CSV input and output
Persistent storage, SQL, and NoSQL
acid-state and safecopy
persistent and esqueleto
HDBC and add-ons
Networking and HTTP
HTTP clients and servers
Supplementary HTTP libraries
JSON remote procedure calls
Using WebSockets
Programming a REST API
Cryptography
Web technologies
Parsing and pretty-printing
Regular expressions in Haskell
Parsing XML
Pretty-printing and text formatting
Control and utility libraries
Using lenses
Easily converting between types (convertible)
Using a custom Prelude
Working with monads and transformers
Monad morphisms – monad-unlift
Handling exceptions
Random number generators
Parallel and concurrent programming
Functional Reactive Programming
Mathematics, statistics, and science
Tools for research and sketching
The HaskellR project
Creating charts and diagrams
Scripting and CLI applications
Testing and benchmarking
Summary
Index

Haskell High Performance Programming

Haskell High Performance Programming

Copyright © 2016 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: September 2016

Production reference: 1190916

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham B3 2PB, UK.

ISBN 978-1-78646-421-7

www.packtpub.com

Credits

Author

Samuli Thomasson

Reviewer

Aaron Stevens

Commissioning Editor

Kunal Parikh

Acquisition Editor

Sonali Vernekar

Content Development Editor

Priyanka Mehta

Technical Editor

Ravikiran Pise

Copy Editor

Safis Editing

Project Coordinator

Izzat Contractor

Proofreader

Safis Editing

Indexer

Tejal Daruwale Soni

Graphics

Abhinash Sahu

Production Coordinator

Melwyn Dsa

Cover Work

Melwyn Dsa

About the Author

Samuli Thomasson is a long-time functional programming enthusiast from Finland who has used Haskell extensively, both as a pastime and commercially, for over four years. He enjoys working with great tools that help in getting things done nice and fast.

His current job at RELEX Solutions consists of providing technical solutions to a variety of practical problems. Besides functional programming, Samuli is interested in distributed systems, which he also studies at the University of Helsinki.

I am grateful to my awesome friends, who have stuck around and provided their support during the writing process, and my family for always being there and their understanding.

About the Reviewer

Aaron Stevens is a scientific software engineer with Molex LLC in Little Rock, Arkansas, where he combines his passion for programming with his education in electrical systems engineering to develop innovative techniques to characterize high-speed electronics in the lab and in production. He specializes in signal processing, statistical process-control methods, and application construction in Python and C#, and he enjoys discovering new methods to explore complex data sets through rich visualizations.

Away from the office, Aaron enjoys practicing with a variety of programming languages, studying linguistics, cooking, and spending time with his family. He received his BS in mathematics and BS in electrical systems engineering from the University of Arkansas in Little Rock.

www.PacktPub.com

eBooks, discount offers, and more

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www2.packtpub.com/books/subscription/packtlib

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.

Why subscribe?

Fully searchable across every book published by PacktCopy and paste, print, and bookmark contentOn demand and accessible via a web browser

Preface

Haskell is an elegant language. It allows us to express in code exactly what we mean, in a clean and compact style. The nice features, including referential transparency and call-by-need evaluation, not only help the programmer be more efficient, but also help Haskell compilers to optimize programs in ways that are otherwise plain impossible. For example, the garbage collector of GHC is notoriously fast, not least thanks to its ability to exploit the immutability of Haskell values.

Unfortunately, high expressivity is a double-edged sword. Reasoning the exact order of evaluation in Haskell programs is, in general, not an easy task. A lack of understanding of the lazy call-by-need evaluation in Haskell will for sure lead the programmer to introduce space leaks sooner or later. A productive Haskell programmer not only has to know how to read and write the language, which is a hard enough skill to achieve in itself, they also need to understand a new evaluation schema and some related details. Of course, in order to not make things too easy, just knowing the language well will not get you very far. In addition, one has to be familiar with at least a few common libraries and, of course, the application domain itself.

This book will give you working knowledge of high-performance Haskell programming, including parallelism and concurrency. In this book, we will cover the language, GHC, and the common libraries of Haskell.

What this book covers

Chapter 1, Identifying Bottlenecks, introduces you to basic techniques for optimal evaluation and avoiding space leaks.

Chapter 2, Choose the Correct Data Structures, works with and optimizes both immutable and mutable data structures.

Chapter 3, Profile and Benchmark to Your Heart's Content, profiles Haskell programs using GHC and benchmarking using Criterion.

Chapter 4, The Devil's in the Detail, explains the small details that affect performance in Haskell programs, including code sharing, specializing, and simplifier rules.

Chapter 5, Parallelize for Performance, exploits parallelism in Haskell programs using the RePa library for data parallelism.

Chapter 6, I/O and Streaming, talks about the pros and cons of lazy and strict I/O in Haskell and explores the concept of streaming.

Chapter 7, Concurrency Performance, explores the different aspects of concurrent programming, such as shared variables, exception handling, and software-transactional memory.

Chapter 8, Tweaking the Compiler and Runtime System, chooses the optimal compiler and runtime parameters for Haskell programs compiled with GHC.

Chapter 9, GHC Internals and Code Optimizations, delves deeper into the compilation pipeline, and understands the intermediate representations of GHC.

Chapter 10, Foreign Function Interface, calls safely to and from C in Haskell using GHC and its FFI support.

Chapter 11, Programming for the GPU with Accelerate, uses the Accelerate library to program backend-agnostic GPU programs and executes on CUDA-enabled systems.

Chapter 12, Scaling to the Cloud with Cloud Haskell, uses the Cloud Haskell ecosystem to build distributed systems with Haskell.

Chapter 13, Functional Reactive Programming, introduces three Haskell FRP libraries, including Elerea, Yampa, and Reactive-banana.

Chapter 14, Library Recommendations, talks about a catalogue of robust Haskell libraries, accompanied with overviews and examples.

What you need for this book

To run most examples in this book, all you need is a working, relatively recent, installation of GHC and some Haskell libraries. Examples are built for nix-like systems, although they are easily adapted for a Windows machine.

The recommended minimum version for GHC is 7.6. The Haskell libraries needed are introduced in the chapters in which they are used. In Chapter 4, The Devil's in the Detail, we use the Haskell Stack tool to perform some tasks, but it isn't strictly required, although it is recommended to install Stack.

In Chapter 11, Programming for the GPU Using Accelerate, executing the CUDA versions of examples requires a CUDA-enabled system and the installation of the CUDA platform.

Who this book is for

To get the most out of this book, you need to have a working knowledge of reading and writing basic Haskell. No knowledge of performance, optimization, or concurrency is required.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail <[email protected]>, and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

You can download the code files by following these steps:

Log in or register to our website using your e-mail address and password.Hover the mouse pointer on the SUPPORT tab at the top.Click on Code Downloads & Errata.Enter the name of the book in the Search box.Select the book for which you're looking to download the code files.Choose from the drop-down menu where you purchased this book from.Click on Code Download.

You can also download the code files by clicking on the Code Files button on the book's webpage at the Packt Publishing website. This page can be accessed by entering the book's name in the Search box. Please note that you need to be logged in to your Packt account.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR / 7-Zip for WindowsZipeg / iZip / UnRarX for Mac7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Haskell-High-Performance-Programming. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Downloading the color images of this book

We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from http://www.packtpub.com/sites/default/files/downloads/HaskellHighPerformanceProgramming_ColorImages.pdf.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at <[email protected]> with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at <[email protected]>, and we will do our best to address the problem.

Summary

In this chapter, we learned how lazy evaluation works, what weak head normal form is, and how to control it by increasing strictness with different methods. We considered the peculiarities of right-fold, left-fold, and strict left-fold, and in which situations one fold strategy works better than another. We introduced the concept of CAF along with memoization techniques, utilized the worker/wrapper pattern, and used guarded recursion to write clean and efficient recursive programs.

We used the :sprint command in GHCi to inspect unevaluated thunks and the Runtime System option -s to inspect the heap usage and GC activity of compiled programs. We took a look at inlining, stream fusion, and the performance costs of partial functions and polymorphism.

In the next chapter, we will take a look at other basic data and control structures, such as different array structures and some monads. But first, we will learn about the performance semantics of Haskell data types and related common optimization techniques.