E-Book
73,19 €

Julia: High Performance Programming E-Book

Ivo Balbaert


Publisher: Packt Publishing
Category: Science and new technologies
Language: English

Description

Leverage the power of Julia to design and develop high performing programs

About This Book

Get to know the best techniques to create blazingly fast programs with Julia
Stand out from the crowd by developing code that runs faster than your peers' code
Complete an extensive data science project through the entire cycle from ETL to analytics and data visualization

Who This Book Is For

This learning path is for data scientists and for all those who work in technical and scientific computation projects. It will be great for Julia developers who are interested in high-performance technical computing.

This learning path assumes that you already have some basic working knowledge of Julia's syntax and high-level dynamic languages such as MATLAB, R, Python, or Ruby.

What You Will Learn

Set up your Julia environment to achieve the highest productivity
Solve your tasks in a high-level dynamic language and use types for your data only when needed
Apply Julia to tackle problems concurrently and in a distributed environment
Get a sense of the possibilities and limitations of Julia's performance
Use Julia arrays to write high performance code
Build a data science project through the entire cycle of ETL, analytics, and data visualization
Display graphics and visualizations to carry out modeling and simulation in Julia
Develop your own packages and contribute to the Julia Community

In Detail

In this learning path, you will learn to use an interesting and dynamic programming language—Julia! You will get a chance to tackle your numerical and data problems with Julia. You'll begin the journey by setting up a running Julia platform before exploring its various built-in types. We'll then move on to the various functions and constructs in Julia. We'll walk through the two important collection types—arrays and matrices in Julia.

You will dive into how Julia uses type information to achieve its performance goals, and how to use multiple dispatch to help the compiler emit high performance machine code. You will see how Julia's design makes code fast, and you'll see its distributed computing capabilities.

By the end of this learning path, you will see how data works using simple statistics and analytics, and you'll discover its high and dynamic performance—its real strength, which makes it particularly useful in highly intensive computing tasks.

This learning path combines some of the best that Packt has to offer in one complete, curated package. It includes content from the following Packt products:

Getting Started with Julia by Ivo Balbaert
Julia High Performance by Avik Sengupta
Mastering Julia by Malcolm Sherrington

Style and approach

This hands-on manual will give you great explanations of the important concepts related to Julia programming.

Details

Page count: 865

Year of publication: 2016

Sample

Julia: High Performance Programming

Credits

Preface

What this learning path covers

What you need for this learning path

Who this learning path is for

Reader feedback

Customer support

Downloading the example code

Errata

Piracy

Questions

I. Module 1

The Rationale for Julia

The scope of Julia

Julia's place among the other programming languages

A comparison with other languages for the data scientist

MATLAB

Python

Useful links

Summary

1. Installing the Julia Platform

Installing Julia

Windows version – usable from Windows XP SP2 onwards

Ubuntu version

OS X

Building from source

Working with Julia's shell

Startup options and Julia scripts

Packages

Adding a new package

Installing and working with Julia Studio

Installing and working with IJulia

Installing Sublime-IJulia

Installing Juno

Other editors and IDEs

How Julia works

Summary

2. Variables, Types, and Operations

Variables, naming conventions, and comments

Types

Integers

Floating point numbers

Elementary mathematical functions and operations

Rational and complex numbers

Characters

Strings

Formatting numbers and strings

Regular expressions

Ranges and arrays

Other ways to create arrays

Some common functions for arrays

How to convert an array of chars to a string

Dates and times

Scope and constants

Summary

3. Functions

Defining functions

Optional and keyword arguments

Anonymous functions

First-class functions and closures

Recursive functions

Map, filter, and list comprehensions

Generic functions and multiple dispatch

Summary

4. Control Flow

Conditional evaluation

Repeated evaluation

The for loop

The while loop

The break statement

The continue statement

Exception handling

Scope revisited

Tasks

Summary

5. Collection Types

Matrices

Tuples

Dictionaries

Keys and values – looping

Sets

Making a set of tuples

Example project – word frequency

Summary

6. More on Types, Methods, and Modules

Type annotations and conversions

Type conversions and promotions

The type hierarchy – subtypes and supertypes

Concrete and abstract types

User-defined and composite types

When are two values or objects equal or identical?

Multiple dispatch example

Types and collections – inner constructors

Type unions

Parametric types and methods

Standard modules and paths

Summary

7. Metaprogramming in Julia

Expressions and symbols

Eval and interpolation

Defining macros

Built-in macros

Testing

Debugging

Benchmarking

Starting a task

Reflection capabilities

Summary

8. I/O, Networking, and Parallel Computing

Basic input and output

Working with files

Reading and writing CSV files

Using DataFrames

Other file formats

Working with TCP sockets and servers

Interacting with databases

Parallel operations and computing

Creating processes

Using low-level communications

Parallel loops and maps

Distributed arrays

Summary

9. Running External Programs

Running shell commands

Interpolation

Pipelining

Calling C and FORTRAN

Calling Python

Performance tips

Tools to use

Summary

10. The Standard Library and Packages

Digging deeper into the standard library

Julia's package manager

Installing and updating packages

Publishing a package

Graphics in Julia

Using Gadfly on data

Summary

A. List of Macros and Packages

Macros

List of packages

II. Module 2

1. Julia is Fast

Julia – fast and dynamic

Designed for speed

JIT and LLVM

Types

How fast can Julia be?

Summary

2. Analyzing Julia Performance

Timing Julia code

Tic and Toc

The @time macro

The @timev macro

The Julia profiler

Using the profiler

ProfileView

Analyzing memory allocation

Using the memory allocation tracker

Statistically accurate benchmarking

Using Benchmarks.jl

Summary

3. Types in Julia

The Julia type system

Using types

Multiple dispatch

Abstract types

Julia's type hierarchy

Composite and immutable types

Type parameters

Type inference

Type-stability

Definitions

Fixing type-instability

Performance pitfalls

Identifying type-stability

Loop variables

Kernel methods

Types in storage locations

Arrays

Composite types

Parametric composite types

Summary

4. Functions and Macros – Structuring Julia Code for High Performance

Using globals

The trouble with globals

Fixing performance issues with globals

Inlining

Default inlining

Controlling inlining

Disabling inlining

Closures and anonymous functions

FastAnonymous

Using macros for performance

The Julia compilation process

Using macros

Evaluating a polynomial

Horner's method

The Horner macro

Generated functions

Using generated functions

Using generated functions for performance

Using named parameters

Summary

5. Fast Numbers

Numbers in Julia

Integers

Integer overflow

BigInt

The floating point

Unchecked conversions for unsigned integers

Trading performance for accuracy

The fastmath macro

The K-B-N summation

Subnormal numbers

Subnormal numbers to zero

Summary

6. Fast Arrays

Array internals in Julia

Array representation and storage

Column-wise storage

Bound checking

Removing the cost of bound checking

Configuring bound checks at startup

Allocations and in-place operations

Preallocating function output

Mutating versions

Array views

SIMD parallelization

Yeppp!

Writing generic library functions with arrays

Summary

7. Beyond the Single Processor

Parallelism in Julia

Starting a cluster

Communication between Julia processes

Programming parallel tasks

@everywhere

@spawn

Parallel for

Parallel map

Distributed arrays

Shared arrays

Threading

Summary

III. Module 3

1. The Julia Environment

Introduction

Philosophy

Role in data science and big data

Comparison with other languages

Features

Getting started

Julia sources

Exploring the source stack

Juno

IJulia

A quick look at some Julia

Julia via the console

Installing some packages

A bit of graphics creating more realistic graphics with Winston

My benchmarks

Package management

Listing, adding, and removing

Choosing and exploring packages

Statistics and mathematics

Data visualization

Web and networking

Database and specialist packages

How to uninstall Julia

Adding an unregistered package

What makes Julia special

Parallel processing

Multiple dispatch

Homoiconic macros

Interlanguage cooperation

Summary

2. Developing in Julia

Integers, bits, bytes, and bools

Integers

Logical and arithmetic operators

Booleans

Arrays

Operations on matrices

Elemental operations

A simple Markov chain – cat and mouse

Char and strings

Characters

Strings

Unicode support

Regular expressions

Byte array literals

Version literals

An example

Real, complex, and rational numbers

Reals

Operators and built-in functions

Special values

BigFloats

Rationals

Complex numbers

Juliasets

Composite types

More about matrices

Vectorized and devectorized code

Multidimensional arrays

Broadcasting

Sparse matrices

Data arrays and data frames

Dictionaries, sets, and others

Dictionaries

Sets

Other data structures

Summary

3. Types and Dispatch

Functions

First-class objects

Passing arguments

Default and optional arguments

Variable argument list

Named parameters

Scope

The Queen's problem

Julia's type system

A look at the rational type

A vehicle datatype

Typealias and unions

Enumerations (revisited)

Multiple dispatch

Parametric types

Conversion and promotion

Conversion

Promotion

A fixed vector module

Summary

4. Interoperability

Interfacing with other programming environments

Calling C and Fortran

Mapping C types

Array conversions

Type correspondences

Calling a Fortran routine

Calling curl to retrieve a web page

Python

Some others to watch

The Julia API

Calling API from C

Metaprogramming

Symbols

Macros

Testing

Error handling

The enum macro

Tasks

Parallel operations

Distributed arrays

A simple MapReduce

Executing commands

Running commands

Working with the filesystem

Redirection and pipes

Perl one-liners

Summary

5. Working with Data

Basic I/O

Terminal I/O

Disk files

Text processing

Binary files

Structured datasets

CSV and DLM files

HDF5

XML files

DataFrames and RDatasets

The DataFrames package

DataFrames

RDatasets

Subsetting, sorting, and joining data

Statistics

Simple statistics

Samples and estimations

Pandas

Selected topics

Time series

Distributions

Kernel density

Hypothesis testing

GLM

Summary

6. Scientific Programming

Linear algebra

Simultaneous equations

Decompositions

Eigenvalues and eigenvectors

Special matrices

A symmetric eigenproblem

Signal processing

Frequency analysis

Filtering and smoothing

Digital signal filters

Image processing

Differential equations

The solution of ordinary differential equations

Non-linear ordinary differential equations

Partial differential equations

Optimization problems

JuMP

Optim

NLopt

Using with the MathProgBase interface

Stochastic problems

Stochastic simulations

SimJulia

Bank teller example

Bayesian methods and Markov processes

Monte Carlo Markov Chains

MCMC frameworks

Summary

7. Graphics

Basic graphics in Julia

Text plotting

Cairo

Winston

Data visualization

Gadfly

Compose

Graphic engines

PyPlot

Gaston

PGF plots

Using the Web

Bokeh

Plotly

Raster graphics

Cairo (revisited)

Winston (revisited)

Images and ImageView

Summary

8. Databases

A basic view of databases

The red pill or the blue pill?

Interfacing to databases

Other considerations

Relational databases

Building and loading

Native interfaces

ODBC

Other interfacing techniques

DBI

SQLite

MySQL

PostgreSQL

PyCall

JDBC

NoSQL datastores

Key-value systems

Document datastores

RESTful interfacing

JSON

Web-based databases

Graphic systems

Summary

9. Networking

Sockets and servers

Well-known ports

UDP and TCP sockets in Julia

A "Looking-Glass World" echo server

Named pipes

Working with the Web

A TCP web service

The JuliaWeb group

The "quotes" server

WebSockets

Messaging

E-mail

Twitter

SMS and esendex

Cloud services

Introducing Amazon Web Services

The AWS.jl package

The Google Cloud

Summary

10. Working with Julia

Under the hood

Femtolisp

The Julia API

Code generation

Performance tips

Best practice

Profiling

Lint

Debugging

Developing a package

Anatomy

Taxonomy

Using Git

Publishing

Community groups

Classifications

JuliaAstro

Cosmology models

The Flexible Image Transport System

The high-level API

The low-level API

JuliaGPU

What's missing?

Summary

A. Bibliography

Index

Julia: High Performance Programming

Leverage the power of Julia to design and develop high performing programs

A course in three modules

BIRMINGHAM - MUMBAI

All rights reserved. No part of this course may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this course to ensure the accuracy of the information presented. However, the information contained in this course is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this course.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this course by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Published on: November 2016

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham B3 2PB, UK.

ISBN 978-1-78712-570-4

www.packtpub.com

Credits

Authors

Ivo Balbaert

Avik Sengupta

Malcolm Sherrington

Reviewers

Pascal Bugnion

Michael Otte

Dustin Stansbury

Zhuo QL

Gururaghav Gopal

Dan Wlasiuk

Content Development Editor

Priyanka Mehta

Graphics

Disha Haria

Production Coordinator

Aparna Bhagat

Preface

Julia is a relatively young programming language. The initial design work on the Julia project began at MIT in August 2009, and by February 2012, it became open source. It is largely the work of three developers: Stefan Karpinski, Jeff Bezanson, and Viral Shah. These three, together with Alan Edelman, remain actively committed to Julia, and MIT currently hosts a variety of courses in Julia, many of which are available over the Internet.

Initially, Julia was envisaged by the designers as a scientific language sufficiently rapid to remove the necessity of modeling in an interactive language and subsequently having to redevelop in a compiled language, such as C or Fortran. At that time the major scientific languages were proprietary ones such as MATLAB and Mathematica, and were relatively slow. There were clones of these languages in the open source domain, such as GNU Octave and Scilab, but these were even slower. When it launched, the community saw Julia as a replacement for MATLAB, but this is not exactly the case. Although the syntax of Julia is similar to MATLAB, so much so that anyone competent in MATLAB can easily learn Julia, it was not designed as a clone. It is a more feature-rich language with many significant differences that will be discussed in depth later.

The period since 2009 has seen the rise of two new computing disciplines: big data/cloud computing and data science. Big data processing on Hadoop is conventionally seen as the realm of Java programming, since Hadoop runs on the Java virtual machine. It is, of course, possible to process big data by using programming languages other than those that are Java-based and utilize the streaming-jar paradigm, and Julia can be used in a way similar to C++, C#, and Python.

The emergence of data science heralded the use of programming languages that were simple enough for analysts who had some programming skills but were not principally programmers. The two languages that stepped up to fill the breach have been R and Python. Both of these are relatively old, with their origins back in the 1990s. However, the popularity of these two has seen rapid growth, ironically from around the time when Julia was introduced to the world. Even so, with such established and staid opposition, Julia has excited the scientific programming community and continues to make inroads in this space.

The aim of this course is to cover all aspects of Julia that make it appealing to the data scientist. The language is evolving quickly. Binary distributions are available for Windows, Mac OS X, and Linux, but these will lag behind the current sources. So, to do some serious work with Julia, it is important to understand how to obtain and build a running system from source. In addition, there are interactive development environments available for Julia, and the course will discuss both the Jupyter and Juno IDEs.

What this learning path covers

Module 1, Getting Started with Julia, gives you a head start to tackle your numerical and data problems with Julia. Your journey will begin by learning how to set up a running Julia platform before exploring its various built-in types. You will then move on to cover the different functions and constructs in Julia. The module will then walk you through the two important collection types: arrays and matrices. Over the course of the module, you will also be introduced to homoiconicity, the meta-programming concept in Julia. Towards the concluding part of the module, you will also learn how to run external programs. This module will cover all you need to know about Julia to leverage its high speed and efficiency.

Module 2, Julia High Performance, will take you on a journey to understand the performance characteristics of your Julia programs, and enable you to utilize the promise of near-C levels of performance in Julia. You will learn to analyze and measure the performance of Julia code, understand how to avoid bottlenecks, and design your program for the highest possible performance. In this module, you will also see how Julia uses type information to achieve its performance goals, and how to use multiple dispatch to help the compiler emit high performance machine code. Numbers and arrays of numbers are the key structures in scientific computing, and you will see how Julia's design makes them fast.

Module 3, Mastering Julia, lets you compare the different ways of working with Julia and explore Julia's key features in depth by looking at design and build. You will see how data works using simple statistics and analytics, discover Julia's speed, its real strength, which makes it particularly useful in highly intensive computing tasks, and observe how Julia can cooperate with external processes in order to enhance graphics and data visualization. Finally, you will look into meta-programming, learn how it adds great power to the language, and establish networking and distributed computing with Julia.

What you need for this learning path

Developing in Julia can be done under any of the familiar computing operating systems: Linux, OS X, and Windows. To explore the language in depth, the reader may wish to acquire the latest versions and to build from source under Linux. However, to work with the language using a binary distribution on any of the three platforms, the installation is very straightforward and convenient. In addition, Julia now comes pre-packaged with the Juno IDE, which just requires expansion from a compressed (zipped) archive.

Who this learning path is for

This learning path assumes that you already have some basic working knowledge of Julia's syntax and high-level dynamic languages such as MATLAB, R, Python, or Ruby.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this course—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail <[email protected]>, and mention the title of the course in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to any of our products, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt course, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for this course from your account at http://www.packtpub.com. If you purchased this course elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

You can download the code files by following these steps:

Log in or register to our website using your e-mail address and password.
Hover the mouse pointer on the SUPPORT tab at the top.
Click on Code Downloads & Errata.
Enter the name of the course in the Search box.
Select the course for which you're looking to download the code files.
Choose from the drop-down menu where you purchased this book from.
Click on Code Download.

You can also download the code files by clicking on the Code Files button on the course's webpage at the Packt Publishing website. This page can be accessed by entering the course's name in the Search box. Please note that you need to be logged into your Packt account.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR / 7-Zip for Windows
Zipeg / iZip / UnRarX for Mac
7-Zip / PeaZip for Linux

The code bundle for the course is also hosted on GitHub at https://github.com/PacktPublishing/Julia-High-Performance-Programming. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books/courses—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this course. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your course, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book/course in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at <[email protected]> with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this course, you can contact us at <[email protected]>, and we will do our best to address the problem.

Part I. Module 1

Getting Started with Julia

Enter the exciting world of Julia, a high-performance language for technical computing

The Rationale for Julia

This introduction will present you with the reasons why Julia is quickly growing in popularity in the technical, data scientist, and high-performance computing arena. We will cover the following topics:

The scope of Julia
Julia's place among other programming languages
A comparison with other languages for the data scientist
Useful links

The scope of Julia

The core designers and developers of Julia (Jeff Bezanson, Stefan Karpinski, and Viral Shah) have made it clear that Julia was born out of a deep frustration with the existing software toolset in the technical computing disciplines. Basically, it boils down to the following dilemma:

Prototyping in this domain needs a high-level, easy-to-use, and flexible language that lets the developer concentrate on the problem itself instead of on low-level details of the language and computation.
The actual computation of a problem needs maximum performance; a factor of 10 in computation time makes a world of difference (think of one day versus ten days), so the production version often has to be (re)written in C or FORTRAN.
Before Julia, practitioners had to be satisfied with a "speed for convenience" trade-off, using developer-friendly and expressive, but decades-old, interpreted languages such as MATLAB, R, or Python to express the problem at a high level. To program the performance-sensitive parts and speed up the actual computation, people had to resort to statically compiled languages such as C or FORTRAN, or even assembly code. Mastery at both levels is not a given: it means writing high-level code in MATLAB, R, or Python for prototyping on the one hand, and writing code that does the same thing in C for the actual execution on the other.

Julia was explicitly designed to bridge this gap. It gives you the possibility of writing high-performance code that uses CPU and memory resources as effectively as can be done in C, while working in pure Julia all the way down reduces the need for a low-level language. This way, you can rapidly iterate using a simple programming model from the problem prototype to near-C performance. The Julia developers have proven that working in one environment that has both expressive capabilities and pure speed is possible, using the recent advances in Low Level Virtual Machine Just in Time (LLVM JIT) compiler technologies (for more information, see http://en.wikipedia.org/wiki/LLVM).

In summary, they designed Julia to have the following specifications:

Julia is open source and free with a liberal (MIT) license.
It is designed to be an easy-to-use and easy-to-learn, elegant, clear, dynamic, and interactive language that reduces development time. To that end, Julia almost looks like pseudocode with an obvious and familiar mathematical notation; for example, here is the definition for a polynomial function, straight from the code:

x -> 7x^3 + 30x^2 + 5x + 42

Notice that there is no need to indicate the multiplications.

It provides the computational power and speed without having to leave the Julia environment.
It has metaprogramming and macro capabilities (due to its homoiconicity, inherited from Lisp; refer to Chapter 7, Metaprogramming in Julia) to increase its abstraction power.
Also, it is usable for general programming purposes, not only in pure computing disciplines.
It has built-in and simple-to-use concurrent and parallel capabilities to thrive in the multicore world of today and tomorrow.

Julia unites this all in one environment, something which was thought impossible until now by most researchers and language designers.
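The polynomial definition above can be tried directly. The following is a minimal sketch, assuming a recent Julia 1.x release (the course itself targets an older version, so details may differ); the names `p` and `q` are chosen here for illustration only.

```julia
# The polynomial from the text, written as an anonymous function.
# `7x^3` is parsed as `7 * x^3`: a numeric literal placed directly
# before a variable implies multiplication.
p = x -> 7x^3 + 30x^2 + 5x + 42

# The same definition accepts integers, floats, and rationals alike.
println(p(2))       # integer input
println(p(0.5))     # floating point input
println(p(1//2))    # exact rational input

# The fully explicit form gives the same result.
q(x) = 7 * x^3 + 30 * x^2 + 5 * x + 42
println(p(2) == q(2))
```

Because no argument type is declared, one definition serves every numeric type that supports `+`, `*`, and `^`.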

The Julia logo

Julia's place among the other programming languages

Julia reconciles and brings together the technologies that before were considered separate, namely:

The dynamic, untyped, and interpreted languages on the one hand (Python, Ruby, Perl, MATLAB/Octave, R, and so on)
The statically typed and compiled languages on the other (C, C++, Fortran, and Fortress)

How can Julia have the flexibility of the first and the speed of the second category?

Julia has no static compilation step. The machine code is generated just-in-time by an LLVM-based JIT compiler. This compiler, together with the design of the language, helps Julia to achieve maximal performance for numerical, technical, and scientific computing. The key to the performance is the type information, which is gathered by a fully automatic and intelligent type inference engine that deduces the type from the data contained in the variables. Indeed, because Julia has a dynamic type system, declaring the type of variables in the code is optional. Indicating types is not necessary, but it can be done to document the code, improve tooling possibilities, or in some cases, to give hints to the compiler to choose a more optimized execution path. This optional typing discipline is an aspect it shares with Dart. Typeless Julia is a valid and useful subset of the language, similar to traditional dynamic languages, but it nevertheless runs at statically compiled speeds.

Julia applies generic programming and polymorphic functions to the limit, writing an algorithm just once and applying it to a broad range of types. This provides common functionality across drastically different types; for example, size is a generic function with 50 concrete method implementations. A system called dynamic multiple dispatch efficiently picks the optimal method for all of a function's arguments from tens of method definitions. Depending on the actual types, very specific and efficient native code implementations of the function are chosen or generated, so Julia's type system lets it align closely with primitive machine operations.

Note

In summary, data flow-based type inference implies multiple dispatch choosing specialized execution code.
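To make the interplay of optional typing and multiple dispatch more concrete, here is a small sketch, assuming a recent Julia 1.x release. The functions `describe` and `double` are hypothetical and exist only for this illustration; the number of methods reported for `size` depends on the Julia version and on the packages loaded.

```julia
# A hypothetical generic function with several methods. Julia picks
# the method from the run-time types of all the arguments.
describe(x::Integer) = "an integer"
describe(x::AbstractFloat) = "a floating point number"
describe(x::AbstractString) = "a string"
describe(x, y) = "two values of any type"
describe(x::Integer, y::AbstractFloat) = "an integer and a float"

println(describe(42))
println(describe(3.14))
println(describe("hi"))
println(describe(1, 2.0))   # the most specific two-argument method wins
println(describe(1, 2))     # falls back to the generic two-argument method

# A function without any type annotations. The argument types are
# inferred at each call and a specialized version is compiled per type.
double(x) = x + x

println(double(3))
println(double(1.5))
println(double(2//3))

# Built-in generic functions work the same way.
println(length(methods(size)), " methods for size")
println(length(methods(describe)), " methods for describe")
```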

However, do keep in mind that types are not statically checked. Exceptions due to type errors can occur at runtime, so thorough testing is mandatory. As to categorizing Julia in the programming language universe, it embodies multiple paradigms, such as procedural, functional, metaprogramming, and also (but not fully) object oriented. It is by no means an exclusively class-based language such as Java, Ruby, or C#. Nevertheless, its type system offers a kind of inheritance and is very powerful. Conversions and promotions for numeric and other types are elegant, friendly, and swift, and user-defined types are as fast and compact as built-in types. As for functional programming, Julia makes it very easy to design programs with pure functions that have no side effects; functions are first-class objects, as in mathematics.
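The points in this paragraph can be sketched in a few lines, assuming a recent Julia 1.x release. The helper `add_one` and the type `Point2D` are hypothetical names introduced only for this example.

```julia
# Types are checked at run time: calling a function with an argument
# it has no method for raises a MethodError when the call executes.
add_one(x::Number) = x + 1
err = try
    add_one("text")
catch e
    e
end
println(err isa MethodError)

# Promotion: mixed numeric arguments are converted to a common type.
println(promote(1, 2.5))
println(typeof(1 + 2.5))

# A user-defined immutable type is stored as compactly as the two
# raw Float64 values it contains.
struct Point2D
    x::Float64
    y::Float64
end
Base.:+(a::Point2D, b::Point2D) = Point2D(a.x + b.x, a.y + b.y)
println(Point2D(1.0, 2.0) + Point2D(0.5, 0.5))
println(sizeof(Point2D))

# Functions are first-class values: they can be passed and composed.
square = x -> x^2
println(map(square, [1, 2, 3]))
println(filter(iseven, 1:10))
println((square ∘ abs)(-4))
```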

Julia also supports a multiprocessing environment based on a message passing model to allow programs to run via multiple processes (local or remote) using distributed arrays, enabling distributed programs based on any of the models for parallel programming.
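As a rough illustration of the message passing model, the sketch below uses the `Distributed` standard library of a recent Julia 1.x release, which is an assumption, since the course targets an older version in which these functions were available without an explicit `using`. It does not use distributed arrays. The function `slow_square` and the choice of two local workers are hypothetical.

```julia
using Distributed

# Start two local worker processes (an arbitrary number chosen for
# illustration).
addprocs(2)

# Define a function on every process. `slow_square` stands in for an
# expensive computation.
@everywhere slow_square(x) = x^2

# Run the function on one specific worker and fetch the result.
w = first(workers())
println(remotecall_fetch(slow_square, w, 7))

# Parallel map: the inputs are handed out to the workers.
squares = pmap(slow_square, 1:5)
println(squares)

# Parallel reduction: each worker sums part of the range and the
# partial sums are combined with +.
total = @distributed (+) for i in 1:100
    slow_square(i)
end
println(total)

# Shut the workers down again.
rmprocs(workers())
```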

Julia is equally suited for general programming as is Python. It has as good and modern (Unicode capable) string processing and regular expressions as Perl or other languages. Moreover, it can also be used at the shell level, as a glue language to synchronize the execution of other programs or to manage other processes.
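A short sketch of these capabilities follows, assuming a recent Julia 1.x release. The sample string and the command are made up for illustration, and actually running the command assumes a Unix-like system where an `echo` program exists.

```julia
# Strings are Unicode aware: length counts characters, not bytes.
s = "Julia ♥ 数据 2016"
println(length(s))        # number of characters
println(ncodeunits(s))    # number of UTF-8 bytes

# Regular expressions use the r"..." literal.
m = match(r"(\d{4})", s)
println(m.captures[1])
println(replace("a1b22c333", r"\d+" => "#"))
println(occursin(r"^Jul", s))

# Backticks build a command object without running it. Interpolated
# variables become single, safely quoted arguments.
name = "two words"
cmd = `echo hello $name`
println(cmd)

# Running the command is platform dependent.
if Sys.isunix()
    println(read(cmd, String))
end
```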

Julia has a standard library written in Julia itself, and a built-in package manager that relies on a GitHub-hosted registry called METADATA to work with a steadily growing collection of external libraries called packages. It is cross platform, supporting GNU/Linux, Darwin/OS X, Windows, and FreeBSD for both x86-64 (64-bit) and x86 (32-bit) architectures.

A comparison with other languages for the data scientist

Because speed is one of the ultimate targets of Julia, a benchmark comparison with other languages is displayed prominently on the Julia website (http://julialang.org/). It shows that Julia rivals C and Fortran, often staying within a factor of two of fully optimized C code, and leaves the traditional dynamic languages far behind. One of Julia's explicit goals is to have sufficiently good performance that you never have to drop down into C. This is in contrast to the following environments, where (even for NumPy) you often have to work with C to get enough performance when moving to production. So, a new era of technical computing can be envisioned, where libraries can be developed in a high-level language instead of in C or Fortran. Julia is especially good at running MATLAB and R-style programs. Let's compare them in somewhat more detail.

MATLAB

Julia is instantly familiar to MATLAB users; its syntax strongly resembles that of MATLAB, but Julia aims to be a much more general purpose language than MATLAB. The names of most functions in Julia correspond to the MATLAB/Octave names, and not the R names. Under the covers, however, the way the computations are done is extremely different. Julia also has equally powerful capabilities in linear algebra, the field where MATLAB is traditionally applied. However, using Julia won't give you the same license fee headaches. Moreover, the benchmarks show that it is from 10 to 1,000 times faster depending on the type of operation, also when compared to Octave (the open source alternative to MATLAB). Julia provides an interface to the MATLAB language with the package MATLAB.jl (https://github.com/lindahua/MATLAB.jl).

R

Until now, R has been the language of choice in the statistics domain. Julia proves to be as usable as R in this domain, but again with a performance increase of a factor of 10 to 1,000. Doing statistics in MATLAB is frustrating, as is doing linear algebra in R, but Julia suits both purposes. Julia has a much richer type system than the vector-based types of R. Some statistics experts such as Douglas Bates heavily support and promote Julia as well. Julia provides an interface to the R language with the package Rif.jl (https://github.com/lgautier/Rif.jl).

Python

Again, Julia has a performance head start of a factor of 10 to 30 compared to Python. Julia compiles code that reads like Python into machine code that performs like C. Furthermore, if necessary, you can call Python functions from within Julia using the PyCall package (https://github.com/stevengj/PyCall.jl).

Because of the huge number of existing libraries in all these languages, any practical data scientist can and will need to mix Julia code with R or Python when the problem at hand demands it.

Julia can also be applied to data analysis and big data, because these often involve predictive analysis, modeling problems that can often be reduced to linear algebra algorithms, or graph analysis techniques, all things Julia is good at tackling.

In the field of High Performance Computing (HPC), a language such as Julia has long been lacking. With Julia, domain experts can experiment and quickly and easily express a problem in such a way that they can use modern HPC hardware as easily as a desktop PC. In other words, a language that gets users started quickly without the need to understand the details of the underlying machine architecture is very welcome in this area.

Useful links

The following are the links that can be useful while using Julia:

The main Julia website can be found at http://julialang.org/
For documentation, refer to http://docs.julialang.org/en/latest
View the packages at http://pkg.julialang.org/index.html
Subscribe to the mailing lists at http://julialang.org/community/
Get support at an IRC channel from http://webchat.freenode.net/?channels=julia

Summary

In this introduction, we gave an overview of Julia's characteristics and compared them to the existing languages in its field. Julia's main advantage is its ability to generate specialized code for different input types. When coupled with the compiler's ability to infer these types, this makes it possible to write Julia code at an abstract level while achieving the efficiency associated with low-level code. Julia is already quite stable and production ready. The learning curve for Julia is very gentle; the idea is that people who don't care about fancy language features should be able to use it productively too, and learn about new features only when they become useful or needed.

Chapter 1. Installing the Julia Platform

This chapter guides you through the download and installation of all the necessary components of Julia. The topics covered in this chapter are as follows:

Installing Julia
Working with Julia's shell
Start-up options and Julia scripts
Packages
Installing and working with Julia Studio
Installing and working with IJulia
Installing Sublime-IJulia
Installing Juno
Other editors and IDEs
Working of Julia

By the end of this chapter, you will have a running Julia platform. Moreover, you will be able to work with Julia's shell as well as with editors or integrated development environments with a lot of built-in features to make development more comfortable.

Installing Julia

The Julia platform in binary (that is, executable) form can be downloaded from http://julialang.org/downloads/. It exists for three major platforms (Windows, Linux, and OS X) in 32- and 64-bit format, and is delivered as a package or in an archive format. You should use the current official stable release when doing serious professional work with Julia (at the time of writing, this is Version 0.3). If you would like to investigate the latest developments, install the upcoming version (which is now Version 0.4). The previous link contains detailed and platform-specific instructions for the installation. We will not repeat these instructions here completely, but we will summarize some important points.

Windows version – usable from Windows XP SP2 onwards

You need to keep the following things in mind if you are using the Windows OS:

As a prerequisite, you need the 7zip extractor program, so first download and install it from http://www.7-zip.org/download.html.

Now, download the julia-n.m.p-win64.exe file to a temporary folder (n.m.p is the version number, such as 0.2.1 or 0.3.0; win32/win64 are respectively the 32- and 64-bit versions; a release candidate file looks like julia-0.4.0-rc1-nnnnnnn-win64, where nnnnnnn is a checksum number such as 0480f1b).

Double-click on the file (or right-click, and select Run as Administrator if you want Julia installed for all users on the machine). Clicking OK on the security dialog message, and then choosing the installation directory (for example, c:\julia) will extract the archive into the chosen folder, producing the following directory structure, and taking some 400 MB of disk space:

The Julia folder structure in Windows

A menu shortcut will be created which, when clicked, starts the Julia command-line version or Read Evaluate Print Loop (REPL), as shown in the following screenshot:

The Julia REPL

On Windows, if you have chosen C:\Julia as your installation directory, this is the C:\Julia\bin\julia.exe file. Add C:\Julia\bin to your PATH variable if you want the REPL to be available on any Command Prompt. The default installation folder on Windows is: C:\Users\UserName\AppData\Local\Julia-n.m.p (where n.m.p is the version number, such as 0.3.2).

More information on Julia in the Windows OS can be found at https://github.com/JuliaLang/julia/blob/master/README.windows.md.

Ubuntu version

For Ubuntu systems (Version 12.04 or later), there is a Personal Package Archive (PPA) for Julia (can be found at https://launchpad.net/~staticfloat/+archive/ubuntu/juliareleases) that makes the installation painless. All you need to do to get the stable version is to issue the following commands in a terminal session:

sudo add-apt-repository ppa:staticfloat/juliareleases
sudo add-apt-repository ppa:staticfloat/julia-deps
sudo apt-get update
sudo apt-get install julia

If you want to be at the bleeding edge of development, you can download the nightly builds instead of the stable releases. The nightly builds are generally less stable, but will contain the most recent features. To do so, replace the first of the preceding commands with:

sudo add-apt-repository ppa:staticfloat/julianightlies

This way, you can always upgrade to a more recent version by issuing the following commands:

sudo apt-get update
sudo apt-get upgrade

The Julia executable lives in /usr/bin/julia (given by the JULIA_HOME variable or by the which julia command) and the standard library is installed in /usr/share/julia/base, with shared libraries in /usr/lib/x86_64-linux-gnu/Julia.

For other Linux versions, the best way to get Julia running is to build from source (refer to the next section).

OS X

Installation for OS X is straightforward, using the standard software installation tools for the platform. Add /Applications/Julia-n.m.app/Contents/Resources/julia/bin/Julia to your PATH to make Julia available everywhere on your computer.

If you want code to be run whenever you start a Julia session, put it in /home/.juliarc.jl on Ubuntu, ~/.juliarc.jl on OS X, or c:\Users\username\.juliarc.jl on Windows. For instance, if this file contains the following code:

println("Greetings! 你好! 안녕하세요?")

Then, Julia starts up in its shell (or REPL as it is usually called) with the following text in the screenshot, which shows its character representation capabilities:

Using .juliarc.jl

Building from source

Perform the following steps to build Julia from source:

Download the source code, rather than the binaries, if you intend to contribute to the development of Julia itself, or if no Julia binaries are provided for your operating system or particular computer architecture. Building from source is quite straightforward on Ubuntu, so we will outline the procedure here. The Julia source code can be found on GitHub at https://github.com/JuliaLang/julia.git.

Compiling these sources will get you the latest Julia version, not the stable version (if you want the latter, download the binaries, and refer to the previous section).

Make sure you have git installed; if not, issue the command:

sudo apt-get -f install git

Then, clone the Julia sources with the following command:

git clone https://github.com/JuliaLang/julia.git

This will download the Julia source code into a julia directory in the current folder.

The Julia building process needs the GNU compilation tools g++, gfortran, and m4, so make sure that you have installed them with the following command:

sudo apt-get install gfortran g++ m4

Now go to the Julia folder and start the compilation process as follows:

cd julia
make

After a successful build, Julia starts up with the ./julia command.

Afterwards, if you want to download and compile the newest version, here are the commands to do this in the Julia source directory:

git pull
make clean
make

For more information on how to build Julia on Windows, OS X, and other systems, refer to https://github.com/JuliaLang/julia/.

Tip

Using parallelization

If you want the build to use n concurrent processes, compile the source with make -j n.

There are two ways of using Julia. As described in the previous section, we can use the Julia shell for interactive work. Alternatively, we can write programs in a text file, save them with a .jl extension, and let Julia execute the whole program sequentially.

Packages

Most of the standard library in Julia (which can be found in /share/julia/base relative to where Julia was installed) is written in Julia itself. The rest of Julia's code ecosystem is contained in packages that are simply Git repositories. They are most often authored by external contributors, and already provide functionality for disciplines as diverse as bioinformatics, chemistry, cosmology, finance, linguistics, machine learning, mathematics, statistics, and high-performance computing. A searchable package list can be found at http://pkg.julialang.org/. Official Julia packages are registered in the METADATA.jl repository, available on GitHub at https://github.com/JuliaLang/METADATA.jl.

Julia's installation contains a built-in package manager, Pkg, for installing additional packages written in Julia. The downloaded packages are stored in a cache, ready to be used by Julia, in the directory given by Pkg.dir(), which is located at c:\users\username\.julia\vn.m\.cache, /home/$USER/.julia/vn.m/.cache, or ~/.julia/vn.m/.cache. If you want to check which packages are installed, run the Pkg.status() command in the Julia REPL to get a list of packages with their versions, as shown in the following screenshot:

Packages list

The Pkg.installed() command gives you the same information, but in dictionary form, so it is usable in code. Version and dependency management is handled automatically by Pkg. Different versions of Julia can coexist even when their packages are incompatible, because each version has its own package cache.

Tip

If you get an error with Pkg.status() such as ErrorException("Unable to read directory METADATA."), issue a Pkg.init() command to create the package repository folders, and clone METADATA from Git. If the problem is not easy to find or the cache becomes corrupted somehow, you can just delete the .julia folder, enter Pkg.init(), and start with an empty cache. Then, add the packages you need.

Adding a new package

Before adding a new package, it is always a good idea to update your package database for the already installed packages with the Pkg.update() command. Then, add a new package by issuing the Pkg.add("PackageName") command, and execute using PackageName in code or in the REPL. For example, to add 2D plotting capabilities, install the Winston package with Pkg.add("Winston"). To make a graph of 100 random numbers between 0 and 1, execute the following commands:

using Winston
plot(rand(100))

The rand(100) call returns an array with 100 random numbers. Plotting it produces the following output:

A plot of white noise with Winston
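Before plotting, it can help to inspect the data that rand(100) returns. The following sketch uses only built-in functions and does not need Winston; the variable name xs is arbitrary.

```julia
xs = rand(100)                      # 100 samples from the interval [0, 1)

println(length(xs))
println(all(x -> 0 <= x < 1, xs))   # every sample lies in the half-open interval
println(sum(xs) / length(xs))       # sample mean; varies from run to run
```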

After installing a new Julia version, update all the installed packages by running Pkg.update() in the REPL. For more detailed information, you can refer to http://docs.julialang.org/en/latest/manual/packages/.

Installing and working with Julia Studio

Julia Studio is a free desktop app for working with Julia that runs on Linux, Windows, and OS X (http://forio.com/labs/julia-studio/). It works with the 0.3 release on Windows and, at the time of writing, with Version 0.2.1 on Linux and OS X; if you want Julia Studio to work with Julia v0.3 on Linux and OS X, you have to compile the source code of the Studio yourself. It contains a sophisticated editor and integrated REPL, version control with Git, and a very handy side pane with access to the command history, filesystem, packages, and the list of edited documents. It is created by Forio, a company that makes software for simulations, data explorations, interactive learning, and predictive analytics. In the following screenshot, you can see some of Julia Studio's features, such as the Console section and the green Run button (or F5) in the upper-right corner. The simple program fizzbuzz.jl prints, for each of the first 100 integers, "fizz" if the number is a multiple of 3, "buzz" if it is a multiple of 5, and "fizzbuzz" if it is a multiple of 15.

Julia Studio

Notice the # sign that indicates the beginning of comments, the elegant and familiar for loop and if elseif construct, and how they are closed with end. The expression 1:100 is a range; mod returns the remainder of the division; for non-negative operands, mod(i, n) can also be written with the operator form i % n. Using four spaces for indentation is a convention. Recently, Forio also developed Epicenter, a computational platform for hosting the server-side models (also in Julia), and building interactive web interfaces for these models.
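Since the screenshot is not reproduced here, the following is a reconstruction of what such a fizzbuzz.jl program might look like, not the exact code shown in Julia Studio. Wrapping the logic in a function named fizzbuzz and printing the number itself for all other integers are assumptions of this sketch.

```julia
# fizzbuzz.jl: a reconstruction, not the exact program from the screenshot.
# Returning the number itself for non-multiples is an assumption.
function fizzbuzz(i)
    if mod(i, 15) == 0          # multiple of both 3 and 5
        return "fizzbuzz"
    elseif i % 3 == 0           # operator form of the remainder
        return "fizz"
    elseif i % 5 == 0
        return "buzz"
    else
        return string(i)
    end
end

for i in 1:100                  # 1:100 is a range
    println(fizzbuzz(i))
end

# mod and % agree for non-negative operands but differ for negative ones:
println(mod(-7, 3))             # result takes the sign of the divisor
println((-7) % 3)               # result takes the sign of the dividend
```

The check for 15 comes first because a multiple of 15 is also a multiple of 3 and of 5, so testing those earlier would hide the "fizzbuzz" case.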

Installing Sublime-IJulia

The popular Sublime Text editor (http://www.sublimetext.com/3) now has a plugin based on IJulia (https://github.com/quinnj/Sublime-IJulia) authored by Jacob Quinn. It gives you syntax highlighting, autocompletion, and an in-editor REPL, which you basically just open like any other text file, but it runs Julia code for you. You can also select some code from a code file and send it to the REPL with the shortcut CTRL + B, or send the entire file there. Sublime-IJulia provides a frontend to the IJulia backend kernel, so that you can start an IJulia frontend in a Sublime view and interact with the kernel. Here is a summary of the installation; for details, you can refer to the preceding URL:

From within the Julia REPL, install the ZMQ and IJulia packages.

From within Sublime Text, install the Package Control package (https://sublime.wbond.net/installation).

From within Sublime Text, install the IJulia package from the Sublime command palette.

Ctrl + Shift + P opens up a new IJulia console. Start entering commands, and press Shift + Enter to execute them. The Tab key provides command completion.

Installing Juno

Another promising IDE for Julia, and a work in progress by Mike Innes and Keno Fischer, is Juno, which is based on the Light Table environment. The docs at http://junolab.org/docs/installing.html provide detailed instructions for installing and configuring Juno. Here is a summary of the steps:

Get LightTable from http://lighttable.com.

Start LightTable, install the Juno plugin through its plugin manager, and restart LightTable.

Light Table works extensively with a command palette that you can open by typing Ctrl + SPACE, entering a command, and then selecting it. Juno provides an integrated console, and you can evaluate single expressions in the code editor directly by typing Ctrl + Enter at the end of the line. A complete script is evaluated by typing Ctrl + Shift + Enter.

Other editors and IDEs

For terminal users, the available editors are as follows:

Vim together with Julia-vim works great (https://github.com/JuliaLang/julia-vim)
Emacs with julia-mode.el from the https://github.com/JuliaLang/julia/tree/master/contrib directory

On Linux, gedit is very good. The Julia plugin works well and provides autocompletion. Notepad++ also has Julia support from the contrib directory mentioned earlier.

The SageMath project (https://cloud.sagemath.com/) runs Julia in the cloud within a terminal and lets you work with IPython notebooks. You can also work and teach with Julia in the cloud using the JuliaBox platform (https://juliabox.org/).

Summary

By now, you should have been able to install Julia in a working environment you prefer. You should also have some experience with working in the REPL. We will put this to good use starting in the next chapter, where we will meet the basic data types in Julia, by testing out everything in the REPL.
