Julia Cookbook - Jalem Raj Rohit - E-Book

Description

Over 40 recipes to get you up and running with programming using Julia

About This Book

  • Follow a practical approach to learn Julia programming the easy way
  • Get an extensive coverage of Julia's packages for statistical analysis
  • This recipe-based approach will help you get familiar with the key concepts in Julia

Who This Book Is For

This book is for data scientists and data analysts who are familiar with the basics of the Julia language. Prior experience of working with high-level languages such as MATLAB, Python, R, or Ruby is expected.

What You Will Learn

  • Extract and handle your data with Julia
  • Uncover the concepts of metaprogramming in Julia
  • Conduct statistical analysis with StatsBase.jl and Distributions.jl
  • Build your data science models
  • Find out how to visualize your data with Gadfly
  • Explore big data concepts in Julia

In Detail

Want to handle everything that Julia can throw at you and get the most out of it every day? This practical guide to programming with Julia for numerical computation will make you more productive and able to work with data more efficiently. The book starts with the main features of Julia to help you quickly refresh your knowledge of functions, modules, and arrays. We'll also show you how to use the Julia language to identify, retrieve, and transform data sets so you can perform data analysis and data manipulation.

Later on, you'll see how to optimize data science programs with parallel computing and memory allocation. You'll get familiar with the concepts of package development and networking to solve numerical problems using the Julia platform.

This book includes recipes on identifying and classifying data science problems, data modelling, data analysis, data manipulation, meta-programming, multidimensional arrays, and parallel computing. By the end of the book, you will acquire the skills to work more effectively with your data.

Style and approach

This book has a recipe-based approach to help you grasp the concepts of Julia programming.

Page count: 138

Year of publication: 2016

Table of Contents

Julia Cookbook
Credits
About the Author
About the Reviewer
www.PacktPub.com
eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Sections
Getting ready
How to do it…
How it works…
There's more…
See also
Conventions
Reader feedback
Customer support
Errata
Piracy
Questions
1. Extracting and Handling Data
Introduction
Why should we use Julia for data science?
Handling data with CSV files
Getting ready
How to do it...
Handling data with TSV files
Getting ready
How to do it...
Working with databases in Julia
Getting ready
How to do it...
MySQL
PostgreSQL
There's more...
MySQL
PostgreSQL
SQLite
Interacting with the Web
Getting ready
How to do it...
GET request
There's more...
2. Metaprogramming
Introduction
Representation of a Julia program
Getting ready
How to do it...
How it works...
There's more
Symbols and expressions
Symbols
Getting ready
How to do it...
How it works...
There's more
Quoting
How to do it...
How it works...
Interpolation
How to do it...
How it works...
There's more
The Eval function
Getting ready
How to do it...
How it works...
Macros
Getting ready
How to do it...
How it works...
Metaprogramming with DataFrames
Getting ready
How to do it...
How it works...
3. Statistics with Julia
Introduction
Basic statistics concepts
Getting ready
How to do it...
How it works...
Descriptive statistics
Getting ready
How to do it...
How it works...
Deviation metrics
Getting ready
How to do it...
How it works...
Sampling
Getting ready
How to do it...
How it works...
Correlation analysis
Getting ready
How to do it...
How it works...
4. Building Data Science Models
Introduction
Dimensionality reduction
Getting ready
How to do it...
How it works...
Linear discriminant analysis
Getting ready
How to do it...
How it works...
Data preprocessing
Getting ready
How to do it...
How it works...
Linear regression
Getting ready
How to do it...
How it works...
Classification
Getting ready
How to do it...
How it works...
Performance evaluation and model selection
Getting ready
How to do it...
How it works...
Cross validation
Getting ready
How to do it...
How it works...
Distances
Getting ready
How to do it...
How it works...
Distributions
Getting ready
How to do it...
How it works...
Time series analysis
Getting ready
How to do it...
How it works...
5. Working with Visualizations
Introduction
Plotting basic arrays
Getting ready
How to do it...
How it works...
Plotting dataframes
Getting ready
How to do it...
How it works...
Plotting functions
Getting ready
How to do it...
How it works...
Exploratory data analytics through plots
Getting ready
How to do it...
How it works...
Line plots
Getting ready
How to do it...
How it works...
Scatter plots
Getting ready
How to do it...
How it works...
Histograms
Getting ready
How to do it...
How it works...
Aesthetic customizations
Getting ready
How to do it...
How it works…
6. Parallel Computing
Introduction
Basic concepts of parallel computing
Getting ready
How to do it...
How it works...
Data movement
Getting ready
How to do it...
How it works...
Parallel maps and loop operations
Getting ready
How to do it...
How it works...
Channels
Getting ready
How to do it...

Julia Cookbook

Copyright © 2016 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: September 2016

Production reference: 1260916

Published by Packt Publishing Ltd. 

Livery Place 

35 Livery Street

Birmingham B3 2PB, UK.

ISBN 978-1-78588-201-2

www.packtpub.com

Credits

Author

Jalem Raj Rohit

Copy Editor

Pranjali Chury

Reviewer

Jakub Glinka

Project Coordinator

Izzat Contractor

Commissioning Editor

Pratik Shah

Proofreader

Safis Editing

Acquisition Editor

Denim Pinto

Indexer

Tejal Daruwale Soni

Content Development Editor

Rohit Singh

Production Coordinator

Aparna Bhagat

Technical Editor

Abhishek R. Kotian

Cover Work

Aparna Bhagat

About the Author

Jalem Raj Rohit is an IIT Jodhpur graduate with a keen interest in machine learning, data science, data analysis, computational statistics, and natural language processing (NLP). Rohit currently works as a senior data scientist at Zomato, also having worked as the first data scientist at Kayako.

He is part of the Julia project, where he develops data science models and contributes to the codebase. Raj is also a Mozilla contributor and volunteer, and he has interned at Scimergent Analytics.

I would like to thank my parents and my family for all their support and encouragement, which helped make this book possible.

About the Reviewer

Jakub Glinka is a mathematician, programmer, and data scientist.

He holds a master's degree in applied mathematics from Warsaw University with a specialization in mathematical statistics.

Since the beginning of his professional career, he has been associated with GfK. His area of expertise ranges from Bayesian modeling to machine learning. He is enthusiastic about new programming languages and currently relies heavily on R and Julia in his professional work.

www.PacktPub.com

For support files and downloads related to your book, please visit www.PacktPub.com.

eBooks, discount offers, and more

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.PacktPub.com , you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www.packtpub.com/mapt

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.

Why subscribe?

  • Fully searchable across every book published by Packt
  • Copy and paste, print, and bookmark content
  • On demand and accessible via a web browser

Free access for Packt account holders

Get notified! Find out when new books are published by following @PacktEnterprise on Twitter or the Packt Enterprise Facebook page.

Preface

Julia is a programming language that promises both speed and support for extensive data science applications. Apart from the official documentation of the language and the individual documentation for each package, there is no single resource that combines all of them and provides a detailed guide to carrying out machine learning and data science. This book aims to solve that problem by being a comprehensive guide to learning data science for a Julia programmer, from exploratory analytics through to visualization.

What this book covers

Chapter 1, Extracting and Handling Data, deals with the importance of the Julia programming language for data science and its applications. It also serves as a guide to handling data in the most commonly available formats, and shows how to crawl and scrape data from the Internet.

Chapter 2, Metaprogramming, covers the concept of metaprogramming, where a language can express its own code as a data structure of the language itself. For example, Lisp expresses code in the form of Lisp lists, which are data structures in Lisp itself. Similarly, Julia can express its code as data structures.

Chapter 3, Statistics with Julia, teaches you how to perform statistics in Julia, along with the common problems of handling data arrays, distributions, estimation, and sampling techniques.

Chapter 4, Building Data Science Models, talks about various data science and statistical models. You will learn to design, customize, and apply them to various data science problems. This chapter will also teach you about model selection and the ways to learn how to build and understand robust statistical models.

Chapter 5, Working with Visualizations, teaches you how to visualize and present data, and also how to analyze the findings from the data science approach that you have taken to solve a particular problem. There are various types of visualizations for displaying your findings, such as the bar plot, the scatter plot, and the pie chart. It is very important to choose an appropriate method that reflects your findings and your work in a sensible and aesthetically pleasing manner.

Chapter 6, Parallel Computing, talks about the concepts of parallel computing and handling a lot of data in Julia.
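
The idea in the Chapter 2 summary, that Julia code can be held as a Julia data structure, can be illustrated with a short sketch. This is not taken from the book's recipes. It uses only quoting, the `Expr` type, and `eval` from Base Julia, and the variable names are chosen purely for illustration.

```julia
# A quoted expression is an ordinary Julia value of type Expr.
ex = :(1 + 2)

println(typeof(ex))   # the expression's type
println(ex.head)      # the kind of expression (a function call here)
println(ex.args)      # the function symbol followed by its arguments

# Because the expression is data, it can be modified before it is evaluated.
ex.args[1] = :*       # turn 1 + 2 into 1 * 2
product = eval(ex)

# Expressions can also be built from parts with the Expr constructor.
sum_ex = Expr(:call, :+, 10, 20)
total = eval(sum_ex)
```

Here `product` holds the value of the rewritten expression and `total` the value of the hand-built one. The point is that both were ordinary data until `eval` was called.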

What you need for this book

You need beginner-level proficiency in the Julia programming language and experience with any programming language, preferably a dynamically typed one such as Python. The software requirements assume you have one of the following operating systems: Linux, Windows, or OS X. There are no specific hardware requirements, except that you should run all your code on a desktop or, preferably, a laptop.

Who this book is for

This book is for beginner-level programmers, preferably Julia programmers who are looking to explore and learn the concepts in the domain of data science.

Sections

In this book, you will find several headings that appear frequently (Getting ready, How to do it…, How it works…, There's more…, and See also).

To give clear instructions on how to complete a recipe, we use these sections as follows:

Getting ready

This section tells you what to expect in the recipe, and describes how to set up any software or any preliminary settings required for the recipe.

How to do it…

This section contains the steps required to follow the recipe.

How it works…

This section usually consists of a detailed explanation of what happened in the previous section.

There's more…

This section consists of additional information about the recipe in order to make the reader more knowledgeable about the recipe.

See also

This section provides helpful links to other useful information for the recipe.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail [email protected], and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at [email protected] with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at [email protected], and we will do our best to address the problem.

Chapter 1. Extracting and Handling Data

In this chapter, we will cover the following recipes:

  • Why should we use Julia for data science?
  • Handling data with CSV files
  • Handling data with TSV files
  • Working with databases in Julia
  • Interacting with the Web

Introduction

This chapter deals with the importance of the Julia programming language for data science and its applications. It also serves as a guide to handling data in the most commonly available formats and shows how to crawl and scrape data from the Internet.

Data science pipelines used in production need to be robust and highly fault-tolerant; otherwise, teams would be exposed to highly error-prone models. These pipelines therefore contain a subprocess called Extract-Transform-Load (ETL): the Extract step pulls the data from a source, the Transform step applies the transformations that cleanse the dataset, and the Load step loads the now-clean data into local databases for use in production. This chapter will also teach you how to interact with websites by sending and receiving data through HTTP requests. Ingesting data is the first step in any data science and analytics pipeline, so this chapter covers some of the methods through which data can be ingested from various data sources.
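
The three ETL steps described above can be sketched in a few lines. This is a minimal illustration rather than one of the book's recipes. It assumes a current Julia 1.x release and uses only Base Julia. The sample data, the function names `extract`, `transform`, and `load!`, and the `Dict` standing in for a database table are all hypothetical.

```julia
# Minimal ETL sketch using only Base Julia; the data and names are made up.

# An in-memory string stands in for a real data source such as a CSV file.
raw = """
name,score
alice,90
bob,
carol,75
"""

# Extract: split the raw text into rows of fields, skipping the header row.
function extract(text::AbstractString)
    lines = split(strip(text), '\n')
    return [split(line, ',') for line in lines[2:end]]
end

# Transform: drop rows with a missing score and convert scores to integers.
function transform(rows)
    cleaned = Tuple{String,Int}[]
    for row in rows
        length(row) == 2 || continue
        name, score = strip(row[1]), strip(row[2])
        isempty(score) && continue
        push!(cleaned, (String(name), parse(Int, score)))
    end
    return cleaned
end

# Load: store the clean records in a Dict that stands in for a database table.
function load!(table::Dict{String,Int}, records)
    for (name, score) in records
        table[name] = score
    end
    return table
end

table = load!(Dict{String,Int}(), transform(extract(raw)))
```

Keeping each step as its own function means a failure can be traced to a single stage, which is one simple way to work toward the fault tolerance mentioned above. In a real pipeline the `Dict` would be replaced by a database connection.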

Why should we use Julia for data science?

Now, you are all set up to learn and experience Julia for data science.

Data Science is simply doing science with data. It applies to a surprisingly wide range of domains, such as engineering, business, marketing, and automotive, owing to the availability of a large amount of data in all these industries from which valuable insights can be extracted and understood.

With the growth of industries, the speed, volume, and variety of the data being produced are increasing drastically. The tools that have to deal with this data are continuously being adapted, which has led to the emergence of more evolved, powerful tools such as Julia.

Julia has been growing steadily as a powerful alternative to the current data science tools. Julia's diverse range of statistical packages along with its powerful compiler features make it a very strong competitor to the current top two programming languages of data science: R and Python. However, advanced users of R and Python can use Julia alongside each of them to reap the maximum benefits from the features of both.