Julia is a highly appropriate language for scientific computing, but it comes with all the required capabilities of a general-purpose language. It allows us to achieve C/Fortran-like performance while maintaining the concise syntax of a scripting language such as Python. It is perfect for building high-performance and concurrent applications. From the basics of its syntax to learning built-in object types, this book covers it all.
This book shows you how to write effective functions, reduce code redundancies, and improve code reuse. It will be helpful for new programmers who are starting out with Julia to explore its wide and ever-growing package ecosystem and also for experienced developers/statisticians/data scientists who want to add Julia to their skill-set.
The book presents the fundamentals of programming in Julia and in-depth informative examples, using a step-by-step approach. You will be taken through concepts and examples such as doing simple mathematical operations, creating loops, metaprogramming, functions, collections, multiple dispatch, and so on.
By the end of the book, you will be able to apply your skills in Julia to create and explore applications of any domain.
Number of pages: 303
Year of publication: 2017
BIRMINGHAM - MUMBAI
Copyright © 2017 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: November 2017
Production reference: 1221117
ISBN 978-1-78588-327-9
www.packtpub.com
Authors
Anshul Joshi
Rahul Lakhanpal
Copy Editor
Safis Editing
Reviewer
Nicholas Paul
Project Coordinator
Vaidehi Sawant
Commissioning Editor
Kunal Parikh
Proofreader
Safis Editing
Acquisition Editor
Denim Pinto
Indexer
Francy Puthiry
Content Development Editor
Rohit Kumar Singh
Graphics
Jason Monteiro
Technical Editor
Ketan Kamble
Production Coordinator
Nilesh Mohite
Anshul Joshi is a data scientist with experience in recommendation systems, predictive modeling, neural networks, and high performance computing. His research interests encompass deep learning, artificial intelligence, and computational physics. Most of the time, he can be caught exploring GitHub or trying anything new he can get his hands on. He blogs at https://anshuljoshi.com/.
Rahul Lakhanpal is a technology and open source enthusiast. With diversified skills including systems engineering, web development, the cloud, and big data, he is language-agnostic and a firm believer in using the best tools and the right language for a particular job.
Rahul is an active contributor to various community portals and loves to solve challenging real-world problems in his leisure time. He is active on Twitter and blogs at http://rahullakhanpal.in/.
Nicholas Paul is a computer science graduate student studying intelligent systems and machine learning. He is involved in the campus as a tutor, research assistant, math club president, and captain of the autonomous vehicle development team. He has several years of experience using Julia for projects ranging from large-scale social media data mining and analysis to command-line interfaces and text editors. He is currently writing an open source learning platform for RobotOS for the students of his university, writing research papers focused on computer vision and deep learning, and developing open source esoteric programming languages.
For support files and downloads related to your book, please visit www.PacktPub.com. Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details. At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www.packtpub.com/mapt
Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.
Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser
Thanks for purchasing this Packt book. At Packt, quality is at the heart of our editorial process. To help us improve, please leave us an honest review on this book's Amazon page at https://www.amazon.com/dp/1785883275.
If you'd like to join our team of regular reviewers, you can e-mail us at [email protected]. We award our regular reviewers with free eBooks and videos in exchange for their valuable feedback. Help us be relentless in improving our products!
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
Understanding Julia's Ecosystem
What makes Julia unique?
Features and advantages of Julia
Installing Julia
Julia on Ubuntu (Linux)
Julia on Fedora/CentOS/Red Hat (Linux)
Julia on Windows
Julia on Mac
Building from source
Understanding the directory structure of Julia's source
Julia's source stack
Julia's importance in data science
Benchmarks
Using REPL
Using help in Julia
Plots in REPL
Using Jupyter Notebook
What is Juno?
Package management
Pkg.status() – package status
Pkg.add() – adding packages
Working with unregistered packages
Pkg.update() – package update
METADATA repository
Developing packages
Creating a new package
A brief about multiple dispatch
Methods in multiple dispatch
Understanding LLVM and JIT
Summary
References
Programming Concepts with Julia
Revisiting programming paradigms
Imperative programming paradigm
Logical programming paradigm
Functional programming paradigm
Object-oriented paradigm
Starting with Julia REPL
Variables in Julia
Naming conventions
Integers, bits, bytes, and bools
Playing with integers in REPL
Understanding overflow behavior
Understanding the Boolean data type
Floating point numbers in Julia
Special functions on floating point numbers
Operations on floating point numbers
Computations with arbitrary precision arithmetic
Writing expressions with coefficients
Logical and arithmetic operations in Julia
Performing arithmetic operations
Performing bitwise operations
Operators for comparison and updating
Precedence of operators
Type conversions (numerical)
Understanding arrays, matrices, and multidimensional arrays
List comprehension in Julia
Creating an empty array
Operations on arrays
Working with matrices
Different operation on matrices
Working with multidimensional arrays (matrices)
Understanding sparse matrices
Understanding DataFrames
NA data type in DataArray
The requirement of the NA data type
DataArray – a series-like data structure
DataFrames – tabular data structures
Summary
Functions in Julia
Creating functions
The special !
Function arguments
Pass by values versus pass by reference
Pass by sharing
The return keyword
Arguments
No arguments
Varargs
Optional arguments
Understanding scope with respect to functions
Nested functions
Anonymous functions
Multiple dispatch
Understanding methods
Recursion
Built-in functions
An example using simple built-in functions
Summary
Understanding Types and Dispatch
Julia's type system
What are types?
Statically-typed versus dynamically-typed languages
So, is Julia a dynamically-typed or statically-typed language?
Type annotations
More on types
The Integer type
The Float type
The Char type
The String type
The Bool type
Type conversions
The subtypes and supertypes
The supertype() function
The subtype() function
User-defined and composite data types
Composite types
Inner constructors
Modules and interfaces
Including files in modules
Module file paths
What is module precompilation?
Multiple dispatch explained
Summary
Working with Control Flow
Conditional and repeated evaluation
Conditional evaluation in detail
Short-circuit evaluation
Repeated evaluation
Defining range
Some more examples of the for loop
The break and continue
Exception handling
The throw() function
The error() function
The try/catch/finally blocks
Tasks in Julia
Summary
Interoperability and Metaprogramming
Interacting with operating systems
Filesystem operations
I/O operations
Example
Calling C and Python!
Calling C from Julia
Calling Python from Julia
Expressions and macros
Macros
But why metaprogramming?
Built-in macros
Type introspection and reflection capabilities
Type introspection
Reflection capabilities
Summary
Numerical and Scientific Computation with Julia
Working with data
Working with text files
Working with CSV and delimited file formats
Working with DataFrames
NA
DataArrays
DataFrames
Linear algebra and differential calculus
Linear algebra
Differential calculus
Statistics
Simple statistics
Basic statistics using DataFrames
Using Pandas
Advanced statistics topics
Distributions
TimeSeries
Hypothesis testing
Optimization
JuMP
Convex.jl
Summary
Data Visualization and Graphics
Basic plots
Bar graphs
Histograms
Pie charts
Scatter plots
3-D surface plots
Vega
Area plots
Aster plots
Choropleth map
Heatmaps
Ribbon plots
Wordcloud
Scatter plots
Gadfly
Interacting with Gadfly using the plot function
Plotting DataFrames with Gadfly
Summary
Connecting with Databases
How to connect with databases?
Relational databases
SQLite
MySQL
NoSQL databases
MongoDB
Introduction to REST
What is JSON?
Web frameworks
Summary
Julia’s Internals
Under the hood
Femtolisp
The Julia Core API
Performance enhancements
Global variables
Type declarations
Fields with abstract types
Container fields with abstract type
Declaring type for keyword arguments
Miscellaneous performance tweaks
Standard library
LLVM and JIT explained
Parallel computing
Focusing on global variables
Running loops in parallel
TCP sockets and servers
Sockets
Creating packages
Guidelines for package naming
Generating a package
Summary
Julia is a high-level, high-performance, dynamic programming language for numerical computing. It offers a unique combination of performance and productivity that promises to change scientific computing and programming. Julia was created to solve the dilemma between high-level but slow code and fast but low-level code, and the need to use both to achieve high performance. It also puts performance center stage, achieving C-like execution speed and lending itself well to multicore, GPU, and cloud computing.
This book demonstrates the basics of Julia along with some data structures and testing tools that will give you enough material to get started with the language from an application standpoint. You will learn and take advantage of Julia while building applications with complex numerical and scientific computations. Through the journey of this book, you will explore the technical aspects of Julia and its potential when it comes to speed and data processing. Also, you will learn to write efficient and high quality code in Julia.
Chapter 1, Understanding Julia's Ecosystem, describes the steps needed to set up the Julia ecosystem. It will also help to understand how the packages are downloaded, installed, updated, and removed. This chapter will also briefly introduce the features of Julia that we will be studying in detail in further chapters.
Chapter 2, Programming Concepts with Julia, gives an overview of the basic syntax of Julia and the programming concepts to get you up and running. This will explain concepts by giving examples of basic programming problems.
Chapter 3, Functions in Julia, takes you through creating functions in Julia. It will explain the importance of functions and best practices of function creation. Various types of functions will also be explained in this chapter.
Chapter 4, Understanding Types and Dispatch, explains in detail the type concept of Julia and how it is able to achieve the performance of statically typed languages. It will also explain powerful techniques to exploit the multiple dispatch provided by Julia.
Chapter 5, Working with Control Flow, explains how to structure the Julia program and different control structures to organize the execution of the code.
Chapter 6, Interoperability and Metaprogramming, explains how Julia provides different ways to interact with the operating system and other languages. Also, this chapter will explain expressions and macros.
Chapter 7, Numerical and Scientific Computation with Julia, explains what makes Julia suitable for numerical and scientific computing and the related features that Julia provides.
Chapter 8, Data Visualization and Graphics, explains with different examples the various sophisticated packages and methods to create beautiful visualizations in Julia.
Chapter 9, Connecting with Databases, deals with the interaction of Julia with databases. Most real-world applications use a database in the backend. It is important to understand how Julia interacts with different types of databases.
Chapter 10, Julia's Internals, provides details and explanations about the intricacies of Julia. It will also explain the standard packages available and networking with Julia. This chapter will also explain the process of creating a package in Julia and publishing it.
To execute the instructions and code in this book, you need to have a system with Julia installed on it. Detailed steps are given at the relevant instances in the book.
This book allows existing programmers, statisticians, and data scientists to learn Julia and benefit from it while building applications with complex numerical and scientific computations. Basic knowledge of mathematics is needed to understand the various methods that will be used or created in the book to exploit the capabilities for which Julia is made.
Feedback from our readers is always welcome. Let us know what you think about this book: what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of. To send us general feedback, simply e-mail [email protected], and mention the book's title in the subject of your message. If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.
You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you. You can download the code files by following these steps:
Log in or register to our website using your e-mail address and password.
Hover the mouse pointer on the SUPPORT tab at the top.
Click on Code Downloads & Errata.
Enter the name of the book in the Search box.
Select the book for which you're looking to download the code files.
Choose from the drop-down menu where you purchased this book from.
Click on Code Download.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR / 7-Zip for Windows
Zipeg / iZip / UnRarX for Mac
7-Zip / PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Learning-Julia. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/Learning-Julia_ColorImages.pdf.
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books (maybe a mistake in the text or the code), we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title. To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.
Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy. Please contact us at [email protected] with a link to the suspected pirated material. We appreciate your help in protecting our authors and our ability to bring you valuable content.
If you have a problem with any aspect of this book, you can contact us at [email protected], and we will do our best to address the problem.
Julia is a new programming language compared to other popular programming languages. Julia was presented publicly to the world and became open source in February of 2012. It all started in 2009, when three developers (Viral Shah, Stefan Karpinski, and Jeff Bezanson) at the Massachusetts Institute of Technology (MIT), under the supervision of Professor Alan Edelman in the Applied Computing group, started working on a project. This led to Julia. All of the principal developers are still actively involved with JuliaLang. They are committed not just to the core language but to the different libraries that have evolved in its ecosystem. Julia is based on solid principles, which we will learn throughout the book. It is becoming more popular day by day, steadily climbing the ranks of the TIOBE index (currently at 43), and gaining traction on Stack Overflow. Researchers are attracted to it, especially those from a scientific computing background.
Anyone can check the source code, which is available on GitHub (https://github.com/JuliaLang/julia). The current release at the time of writing this book is 0.6 with 633 contributors, 39,010 commits, and 9,398 stars on GitHub. Most of the core is written in Julia itself and there are a few chunks of code in C/C++, Lisp, and Scheme.
This chapter will take you through the installation and a basic understanding of all the necessary components of Julia. This chapter covers the following topics:
What makes Julia unique?
Installing Julia
Julia's importance in data science
Using REPL
Using Jupyter Notebook
What is Juno?
Package management
A brief about multiple dispatch
Understanding LLVM and JIT
Scientific computing places the highest demands on computing resources. Over the years, the scientific community has used dynamic languages, which are comparatively much slower, to build its applications. A major reason for this is that applications are generally developed by physicists, biologists, financial experts, and other domain experts who, despite having experience with programming, are not seasoned developers. These experts often prefer dynamic languages over statically typed languages, which could have given them better performance, simply because dynamic languages ease development and readability. However, there are now special packages to improve the performance of the code, such as Numba for Python.
As compiler techniques and language design have advanced, it is now possible to eliminate the trade-off between performance and dynamic prototyping. The requirement was to build a language that is as easy to read and code in as Python, which is a dynamic language, and that gives the performance of C, which is a statically typed language. In 2012, a new language emerged: Julia. It is a general-purpose programming language highly suited for scientific and technical computing. Julia's performance is comparable to C/C++, as measured on the different benchmarks available on JuliaLang's homepage, and it simultaneously provides an environment that can be used effectively for prototyping, like Python.
Julia is able to achieve such performance because of its design and its Low Level Virtual Machine (LLVM)-based just-in-time (JIT) compiler. These enable it to approach the performance of C and Fortran. We will be reading more about LLVM and JIT at the end of the chapter. The following quote from Julia's development team gives the gist of why Julia was created (source: https://julialang.org/blog/2012/02/why-we-created-julia):
Julia is strongly influenced by Python for its readability and rapid prototyping capabilities, by R for its support for mathematical and statistical operations, by MATLAB (and GNU Octave) for its vectorized numerical functions, especially on matrices, and by several other languages. Some of these languages have existed for more than 20 years. Julia borrows ideas from many of them and tries to bring the best of all these worlds together, and it largely succeeds.
Julia is really good at scientific computing but is not restricted to just that, as it can also be used for web and general purpose programming. Julia's development team aims to create a remarkable and previously unseen combination of power and efficiency in one single language without compromising ease of use. Most of Julia's core is implemented in Julia. Julia's parser is written in Scheme. Julia's efficient and cross-platform I/O is provided by the libuv of Node.js.
Some of Julia's features are mentioned as follows:
It is designed for distributed and parallel computation.
Julia provides an extensive library of mathematical functions with great numerical accuracy.
Julia supports multiple dispatch, which will be explained in detail in the coming chapters. Multiple dispatch means that a function's behavior can be defined for many combinations of argument types. Julia provides efficient, specialized, and automatic generation of code for different argument types.
The PyCall package enables Julia to call Python functions in its code, and MATLAB can be called using the MATLAB.jl package. Functions and libraries written in C can also be called directly without any need for APIs or wrappers.
Julia provides powerful shell-like capabilities for managing other processes in the system.
Unlike in many other languages, user-defined types in Julia are as fast and compact as built-ins.
Scientific computing makes great use of vectorized code to gain performance benefits. Julia eliminates the need to vectorize code to gain performance. De-vectorized code written in Julia can be as fast as the vectorized code.
It uses lightweight green threading, also known as tasks or coroutines, cooperative multitasking, or one-shot continuations.
Julia has a powerful type system. The conversions provided are elegant and extensible.
It has efficient support for Unicode.
It has facilities for metaprogramming and Lisp-like macros.
It has a built-in package manager (Pkg).
It's free and open source with an MIT license.
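To make the multiple dispatch item in the list above more concrete, here is a minimal sketch. The function name combine and the type Celsius are hypothetical, chosen only for illustration, and the code assumes Julia 0.6 or later (where the struct keyword is available). It is not taken from the book's own examples.

```julia
# Hypothetical user-defined type; it takes part in dispatch like any built-in type.
struct Celsius
    degrees::Float64
end

# One generic function with several methods. Julia selects the method from the
# types of all the arguments, not only the first one.
combine(a::Integer, b::Integer) = a + b
combine(a::AbstractString, b::AbstractString) = string(a, " ", b)
combine(a::Celsius, b::Celsius) = Celsius(a.degrees + b.degrees)
combine(a::Number, b::AbstractString) = string(a, b)   # mixed argument types

println(combine(1, 2))                                  # 3
println(combine("multiple", "dispatch"))                # multiple dispatch
println(combine(Celsius(20.0), Celsius(1.5)).degrees)   # 21.5
println(combine(3.5, " units"))                         # 3.5 units
println(length(methods(combine)))                       # 4
```

Calling combine with a combination of types that has no matching method, such as combine("a", 1), raises a MethodError instead of silently picking some other behavior.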
As mentioned earlier, Julia is open source and is available for free. It can be downloaded from the website at http://julialang.org/downloads/.
The website has links to documentation, tutorials, learning, videos, and examples. The documentation can be downloaded in popular formats, as shown in the following screenshot:
It is highly recommended to use the generic binaries for Linux provided on the julialang.org website.
As Ubuntu and Fedora are widely used Linux distributions, a few developers were kind enough to make installation on these distributions easier by providing Julia through the package manager. We will go through them in the following sections.
Ubuntu and its derivatives are among the most popular Linux distributions. Julia's deb packages (self-extracting binaries) are available on the JuliaLang website mentioned earlier, for both 32-bit and 64-bit distributions. One can also add a Personal Package Archive (PPA), a repository used to build and publish Ubuntu packages, which apt treats like any other repository. In the Terminal, type the following commands:
$ sudo add-apt-repository ppa:staticfloat/juliareleases
$ sudo apt-get update
This adds the PPA and updates the package index in the repository. Now install Julia using the following command:
$ sudo apt-get install julia
The installation is complete. To check whether the installation is successful in the Terminal, type the following:
$ julia --version
This gives the installed Julia's version:
julia version 0.5.0
To uninstall Julia, simply use apt to remove it:
$ sudo apt-get remove julia
For Fedora/RHEL/CentOS or distributions based on them, enable the EPEL repository for your distribution version. Then, click on the link provided. Enable Julia's repository using the following:
$ dnf copr enable nalimilan/julia
Or copy the relevant .repo file to:
/etc/yum.repos.d/
Finally, in the Terminal type the following:
$ yum install julia
Go to the Julia download page (https://julialang.org/downloads/) and get the .exe file that matches your system's architecture (32-bit/64-bit). The architecture can be found in the computer's system properties. If it is amd64 or x86_64, choose the 64-bit binary (.exe); otherwise, choose the 32-bit binary. Julia is installed on Windows by running the downloaded .exe file, which will extract Julia into a folder. Inside this folder is an executable called julia.exe, which can be used to start the Julia console.
Users with macOS need to click on the downloaded .dmg file to run the disk image. After that, drag the app icon into the Applications folder. It may prompt you to ask if you want to continue, as the source has been downloaded from the internet and so is not considered secure. Click on Continue if it was downloaded from the official Julia language website. Julia can also be installed using Homebrew on a Mac, as follows:
$ brew update
$ brew tap staticfloat/julia
$ brew install julia
The installation is complete. To check whether the installation is successful in the Terminal, type the following:
$ julia --version
This gives you the Julia version installed.
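The same check can also be made from inside Julia. The following is a small sketch, not part of the original instructions: it relies only on the built-in VERSION constant and the v"..." version literal, and the name minimum_version and its threshold of 0.6 are assumptions picked for illustration.

```julia
# VERSION is a built-in constant of type VersionNumber.
println("Running Julia ", VERSION)

# v"..." builds a VersionNumber, so versions compare component by component
# rather than as strings (for example, v"0.10" is newer than v"0.9").
minimum_version = v"0.6"   # assumed threshold, for illustration only

if VERSION >= minimum_version
    println("This installation is at least ", minimum_version)
else
    println("This installation is older than ", minimum_version)
end
```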
Building from source could be challenging for beginners. We assume that you are on Linux (Ubuntu) right now and are building from source. This provides the latest build of Julia, which may not be completely stable. Perform the following steps to build Julia from source:
On the downloads page of the Julia website, download the source. You can choose Tarball or GitHub. It is recommended to use GitHub. To clone the repo, Git must be installed on the machine. Otherwise, choose Download as ZIP. Here is the link: https://github.com/JuliaLang/julia.git.
Building Julia requires g++, gfortran, and m4. Install them first, if they are not installed already, using the following command:
$ sudo apt-get install gfortran g++ m4
Traverse inside the Julia directory and start the make process, using the following command:
$ cd julia
$ make
On a successful build, Julia can be started with the ./julia command.
If you used Git to download the source, you can stay up to date by compiling the newest version using the following commands:
$ git pull
$ make clean
$ make
Building from source on Windows and macOS is also straightforward. Instructions can be found at https://github.com/juliaLang/julia/.
Let's have a look at the directories and their contents:
Directory          Contents
base/              Julia's standard library
contrib/           Miscellaneous set of scripts, configuration files
deps/              External dependencies
doc/src/manual     Source for user manual
doc/src/stdlib     Source for standard library function help text
examples/          Example Julia programs
src/               Source for Julia's language core
test/              Test suites
test/perf          Benchmark suites
ui/                Source for various frontends
A brief explanation of the directories mentioned earlier:
The base/ directory consists of most of the standard library.
The src/ directory contains the core of the language.
There is also an examples directory containing some good code examples, which can be helpful when learning Julia. It is highly recommended to work through these alongside the book.
After a successful build on Linux, these directories can be found in Julia's folder. These are usually present in the build directory.
In the last decade, data science has become a buzzword, with Harvard Business Review naming it the sexiest job of the 21st century. What is a data scientist? The answer was published in The Guardian (https://www.theguardian.com/careers/2015/jun/30/whats-a-data-scientist-and-how-do-i-become-one):
The technical skills of a data scientist are varied but, generally, they are good at programming, and have a very strong background in mathematics—especially statistics, skills in machine learning, and knowledge of big data. A data scientist is required to have in-depth understanding of the domain he/she is working in. Julia was designed for scientific and numerical computation. And with the advent of big data, there is a requirement to have a language that can work on huge amounts of data. Although we have Spark and MapReduce (Hadoop) as processing engines that are generally used with Python, Scala, and Java, Julia with Intel's High Performance Analytics Toolkit can provide an alternative option. It may also be worth noting that Julia excels at parallel computing but is much easier to write and prototype than Spark/Hadoop.
One great feature of Julia is that it solves the two-language problem. Generally, with Python and R, the code that does most of the heavy lifting is written in C/C++ and then called from the high-level language. This is not required with Julia, as it can perform comparably to C/C++. Therefore, the complete code, including the parts that do heavy computations, can be written in Julia itself.
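As a rough illustration of writing the computational part directly in Julia, the sketch below expresses the same calculation once as an explicit loop and once in a vectorized style. The function names are hypothetical, and no timings are claimed here; actual performance depends on the machine and the Julia version.

```julia
# Sum of squares written as an explicit loop. In Julia a loop like this is
# compiled to native code, so it does not have to be moved into C.
function sum_of_squares_loop(xs)
    total = zero(eltype(xs))   # start from a zero of the element type
    for x in xs
        total += x * x
    end
    return total
end

# The same computation in a vectorized, array-at-a-time style.
sum_of_squares_vectorized(xs) = sum(xs .^ 2)

data = [1.0, 2.0, 3.0, 4.0]
println(sum_of_squares_loop(data))         # 30.0
println(sum_of_squares_vectorized(data))   # 30.0
```

Both versions return the same value, so the choice between them can be made on readability rather than on a need to avoid loops.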
The Read-Eval-Print Loop (REPL) is an interactive language shell for testing out pieces of code. Julia provides an interactive shell backed by its JIT compiler. We type input on a line, it is compiled and evaluated, and the result is printed on the next line:
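A minimal sketch of that read-evaluate-print cycle: each line below is something one might type at the prompt, and the comment beside it is the value the REPL would echo back. The variable name greeting is just an illustration. When the same lines are run as a script instead, nothing is printed, because only the interactive shell echoes results.

```julia
1 + 2                            # the REPL echoes 3
sqrt(16)                         # the REPL echoes 4.0
typeof(1.5)                      # the REPL echoes Float64
greeting = "Hello, " * "Julia"   # the REPL echoes "Hello, Julia"
```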
Julia's shell can be started easily, just by typing julia in the Terminal. This brings up the logo and information about the version. This julia>
