Mastering Exploratory Analysis with pandas - Harish Garg - E-Book

Mastering Exploratory Analysis with pandas E-Book

Harish Garg

Description

Explore Python frameworks like pandas, Jupyter notebooks, and Matplotlib to build data pipelines and data visualizations.

Key Features

  • Learn to set up data analysis pipelines with pandas and Jupyter notebooks
  • Effective techniques for data selection, manipulation, and visualization
  • Introduction to Matplotlib for interactive data visualization using charts and plots

Book Description

pandas is a Python library that lets you manipulate, transform, and analyze data. It is a popular framework for exploratory data visualization and for analyzing datasets and data pipelines based on their properties.

This book will be your practical guide to exploring datasets using pandas. You will start by setting up Python, pandas, and Jupyter Notebooks. You will learn how to use Jupyter Notebooks to run Python code. We then show you how to get data into pandas and do some exploratory analysis, before learning how to manipulate and reshape data using pandas methods. You will also learn how to deal with missing data in your datasets, how to draw charts and plots using pandas and Matplotlib, and how to create effective visualizations for your audience. Finally, you will wrap up your newly gained pandas knowledge by learning how to export data from pandas into some popular file formats.

By the end of this book, you will have a better understanding of exploratory analysis and how to build exploratory data pipelines with Python.

What you will learn

  • Learn how to read different kinds of data into pandas DataFrames for data analysis
  • Manipulate, transform, and apply formulas to data imported into pandas DataFrames
  • Use pandas to analyze and visualize different kinds of data to gain real-world insights
  • Extract transformed data from pandas DataFrames and convert it into the formats your application expects
  • Manipulate and model time-series data, perform algorithmic trading, derive results on fixed and moving windows, and more
  • Effective data visualization using Matplotlib
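
As a rough illustration of the moving-window and export items in the list above, here is a minimal sketch. The column names and prices are invented for the example and do not come from the book.

```python
import json

import pandas as pd

# Hypothetical daily closing prices; the values are made up for illustration.
prices = pd.DataFrame(
    {"close": [10.0, 11.0, 12.0, 13.0, 14.0]},
    index=pd.date_range("2018-01-01", periods=5, freq="D"),
)

# Moving window: mean of each row and the two rows before it.
# The first two rows have no full window, so they are NaN.
prices["ma3"] = prices["close"].rolling(window=3).mean()

# Export: with no path argument, to_csv and to_json return strings.
csv_text = prices.to_csv()
records = json.loads(prices.to_json(orient="records"))
```

Passing a file path to `to_csv` or `to_json` writes to disk instead of returning a string, which is the more common form in an actual pipeline.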

Who this book is for

If you are a budding data scientist looking to learn the popular pandas library, or a Python developer looking to step into the world of data analysis, this book is the ideal resource you need to get started. Some programming experience in Python will be helpful to get the most out of this book.

Page count: 87

Publication year: 2018

Mastering Exploratory Analysis with pandas

Build an end-to-end data analysis workflow with Python

Harish Garg

BIRMINGHAM - MUMBAI

Mastering Exploratory Analysis with pandas

Copyright © 2018 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Commissioning Editor: Pavan Ramchandani
Acquisition Editor: Nelson Morris
Content Development Editor: Karan Thakkar
Technical Editor: Suwarna Patil
Copy Editor: Safis Editing
Project Coordinator: Nidhi Joshi
Proofreader: Safis Editing
Indexer: Pratik Shirodkar
Graphics: Jisha Chirayil
Production Coordinator: Arvindkumar Gupta

First published: September 2018

Production reference: 1290918

Published by Packt Publishing Ltd., Livery Place, 35 Livery Street, Birmingham B3 2PB, UK.

ISBN 978-1-78961-963-8

www.packtpub.com

mapt.io

Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry leading tools to help you plan your personal development and advance your career. For more information, please visit our website.

Why subscribe?

  • Spend less time learning and more time coding with practical eBooks and videos from over 4,000 industry professionals
  • Improve your learning with Skill Plans built especially for you
  • Get a free eBook or video every month
  • Mapt is fully searchable
  • Copy and paste, print, and bookmark content

Packt.com

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks. 

Contributors

About the author

Harish Garg is a data analyst, author, and software developer who is really passionate about data science and Python. He is a graduate of Udacity's Data Analyst Nanodegree program. He has 17 years of industry experience in data analysis using Python, developing and testing enterprise and consumer software, managing projects and software teams, and creating training material and tutorials. He also worked for 11 years for Intel Security (previously McAfee, Inc.). He regularly contributes articles and tutorials on data analysis and Python. He is also active in the open data community and is a contributing member of the Data4Democracy open data initiative. He has written data analysis pieces for the Takshashila think tank.

Packt is searching for authors like you

If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.

Table of Contents

Title Page

Copyright and Credits

Mastering Exploratory Analysis with pandas

Packt Upsell

Why subscribe?

Packt.com

Contributors

About the author

Packt is searching for authors like you

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Get in touch

Reviews

Working with Different Kinds of Datasets

Using advanced options while reading data from CSV files

Importing modules

Advanced read options

Manipulating columns, index locations, and names

Specifying a different row as a header

Specifying a column as an index

Choosing a subset of columns to be read

Handling missing and NA data

Choosing whether to skip over blank rows

Data parsing options

Skipping rows from the footer or end of the file

Reading the subset of a file or a certain number of rows

Reading data from Excel files

Basic Excel read

Specifying which sheet should be read

Reading data from multiple sheets

Finding out sheet names

Choosing header or column labels

No header

Skipping rows at the beginning

Skipping rows at the end

Choosing columns

Column names

Setting an index while reading data

Handling missing data while reading

Reading data from other popular formats

Reading a JSON file

Reading JSON data into pandas

Reading HTML data

Reading a PICKLE file

Reading SQL data

Reading data from the clipboard

Summary

Data Selection

Introduction to datasets

Selecting data from the dataset

Multi-column selection

Dot notation

Selecting multiple rows and columns from a pandas DataFrame

Selecting a single row and multiple columns

Selecting values from a range of rows and all columns

Sorting a pandas DataFrame

Filtering rows of a pandas DataFrame

Applying multiple filter criteria to a pandas DataFrame

Filtering based on multiple conditions – AND

Filtering based on multiple conditions – OR

Filtering using the isin method

Using the isin method with multiple conditions

Using the axis parameter in pandas

Usage of the axis parameter

Axis usage examples

More examples of the axis keyword

The axis keyword

Using string methods in pandas

Checking for a substring

Changing the values of a series or column into uppercase

Changing the values into lowercase

Finding the length of every value of a column

Removing white spaces

Replacing parts of a column's values

Changing the datatype of a pandas series

Changing an int datatype column to a float

Changing the datatype while reading data

Converting string to datetime

Summary

Manipulating, Transforming, and Reshaping Data

Modifying a pandas DataFrame using the inplace parameter

Using the groupby method

Handling missing values in pandas

Indexing in pandas DataFrames

Renaming columns in a pandas DataFrame

Removing columns from a pandas DataFrame

Working with date and time series data

Handling SettingWithCopyWarning

Applying a function to a pandas series or DataFrame

Merging and concatenating multiple DataFrames into one

Summary

Visualizing Data Like a Pro

Controlling plot aesthetics

Our first plot with seaborn

Changing the plot style with set_style

Setting the plot background to a white grid

Setting the plot background to dark

Setting the background to white

Adding ticks

Customizing styles

Style parameters

Plotting context presets

Choosing the colors for plots

Changing the color palette

Building custom color palettes

Plotting categorical data

Scatterplot

Swarm plot

Box plot

Violin plot

Bar plot

Wide-form plot

Plotting with Data-Aware Grids

Plotting with the FacetGrid() method

Plotting with the PairGrid() method 

Plotting with the PairPlot() method 

Summary

Other Books You May Enjoy

Leave a review - let other readers know what you think

Preface

In this book, you will learn in depth about pandas, a Python library for manipulating, transforming, and analyzing data. It is a popular framework for exploratory data visualization, which is a method for analyzing datasets and data pipelines based on their properties.

This book will be your practical guide to exploring datasets using pandas. You will start by setting up Python, pandas, and Jupyter Notebooks. You will learn how to use Jupyter Notebooks to run Python code. We will then show you how to get data into pandas and perform some exploratory analysis. You will learn how to manipulate and reshape data using pandas methods. You will also learn how to deal with missing data from your datasets, how to draw charts and plots using pandas and Matplotlib, and how to create some effective visualizations for your audience. Finally, we will wrap up your newly gained pandas knowledge by teaching you how to get data out of pandas and into a number of popular file formats. 

Who this book is for

This book is for the budding data scientist looking to learn about the popular pandas library, or the Python developer looking to step into the world of data analysis. If you fall into either of those categories, then this book is the ideal resource for you to get started.

What this book covers

Chapter 1, Working with Different Kinds of Datasets, teaches you about using advanced options when reading data from CSV files and Excel files.
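
To give a flavor of the reading options Chapter 1 refers to, here is a minimal sketch using `pandas.read_csv`. The CSV text, column names, and the `--` missing-value marker are assumptions made up for this example and are not taken from the book's datasets.

```python
import io

import pandas as pd

# Hypothetical CSV text: a junk line above the header, a blank row,
# a custom missing-value marker ("--"), and a footer line.
raw = """exported by some tool
id,name,price,joined
1,apple,1.5,2018-01-05

2,pear,--,2018-02-10
3,plum,2.0,2018-03-15
total rows: 3
"""

df = pd.read_csv(
    io.StringIO(raw),
    header=1,                  # the second line holds the column labels
    index_col="id",            # use the id column as the index
    usecols=["id", "name", "price", "joined"],
    na_values=["--"],          # treat "--" as missing, besides the defaults
    skip_blank_lines=True,     # the default, shown here for clarity
    parse_dates=["joined"],    # parse this column as datetimes
    skipfooter=1,              # drop the trailing summary line
    engine="python",           # skipfooter is only supported by this engine
)

# Reading only part of a file: skip the junk line, then take two data rows.
head = pd.read_csv(io.StringIO(raw), skiprows=1, nrows=2)
```

`pandas.read_excel` accepts several similar parameters (for example `header`, `index_col`, `usecols`, and `na_values`), along with `sheet_name` for choosing a sheet.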

Chapter 2, Data Selection, looks at how to use the pandas series data structure to select data. You will also learn how to sort and filter data from pandas DataFrames and how to change datatypes in pandas series.
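
The operations that Chapter 2 names can be sketched briefly as follows. The DataFrame here is hypothetical, with invented city names and figures, and the snippet is only an illustration of the kind of calls involved.

```python
import pandas as pd

# Hypothetical data; names and numbers are invented for illustration.
df = pd.DataFrame(
    {
        "city": ["Oslo", "Lima", "Pune", "Kyiv"],
        "country": ["NO", "PE", "IN", "UA"],
        "population": [700, 9700, 3100, 2900],
        "surveyed": ["2018-01-05", "2018-02-10", "2018-03-15", "2018-04-20"],
    }
)

# Selection: several columns, then a label-based slice of rows and columns.
two_columns = df[["city", "population"]]
first_rows = df.loc[0:1, ["city", "country"]]  # loc slices include the end label

# Sorting by a column, largest first.
by_size = df.sort_values("population", ascending=False)

# Filtering: combine conditions with & (AND) or | (OR); parentheses are required.
large_not_pe = df[(df["population"] > 1000) & (df["country"] != "PE")]
small_or_in = df[(df["population"] < 1000) | (df["country"] == "IN")]

# Filtering with isin and with a string method.
selected = df[df["country"].isin(["NO", "UA"])]
has_y = df[df["city"].str.contains("y")]

# Changing datatypes: int to float, and string to datetime.
df["population"] = df["population"].astype(float)
df["surveyed"] = pd.to_datetime(df["surveyed"])
```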

Chapter 3, Manipulating, Transforming, and Reshaping Data, explores how to modify pandas DataFrames. You will also learn how to use the groupby method, how to handle missing values, and how to work with indexes in pandas DataFrames. This chapter will also teach you how to work with date and time data and how to apply functions to pandas series or DataFrames.
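
A small sketch of the kind of transformations Chapter 3 covers might look like the following. The sales data, column names, and manager names are hypothetical, and filling missing values with 0 is just one possible choice among several.

```python
import numpy as np
import pandas as pd

# Hypothetical sales records; one value is deliberately missing.
sales = pd.DataFrame(
    {
        "region": ["north", "south", "north", "south"],
        "units": [10, np.nan, 30, 40],
        "sold_on": pd.to_datetime(
            ["2018-01-01", "2018-01-02", "2018-02-01", "2018-02-02"]
        ),
    }
)

# Missing values: replace NaN with 0 (dropna() would remove the row instead).
sales["units"] = sales["units"].fillna(0)

# groupby: total units per region.
totals = sales.groupby("region")["units"].sum()

# Renaming returns a new DataFrame; inplace=True would modify sales directly.
sales = sales.rename(columns={"units": "units_sold"})

# Date and time data: the .dt accessor exposes datetime components.
sales["month"] = sales["sold_on"].dt.month

# Applying a function to every value of a series.
sales["label"] = sales["region"].apply(str.upper)

# Merging with a second, equally hypothetical, lookup table.
managers = pd.DataFrame({"region": ["north", "south"], "manager": ["Ana", "Raj"]})
merged = sales.merge(managers, on="region", how="left")
```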

Chapter 4, Visualizing Data Like a Pro, will show you how to control plot aesthetics, including how to choose colors for plots. You will also learn how to plot categorical data and get to grips with plotting with data-aware grids.