If you work with data in Python and are looking to create data apps that showcase ML models and make beautiful interactive visualizations, then this is the ideal book for you. Streamlit for Data Science, Second Edition, shows you how to create and deploy data apps quickly, all within Python. This helps you create prototypes in hours instead of days!
Written by a prolific Streamlit user and senior data scientist at Snowflake, this fully updated second edition builds on the practical nature of the previous edition with exciting updates, including connecting Streamlit to data warehouses like Snowflake, integrating Hugging Face and OpenAI models into your apps, and connecting and building apps on top of Streamlit databases. Plus, there is a totally updated code repository on GitHub to help you practice your newfound skills.
You'll start your journey with the fundamentals of Streamlit and gradually build on this foundation by working with machine learning models and producing high-quality interactive apps. The practical examples of both personal data projects and work-related data-focused web applications will help you get to grips with more challenging topics such as Streamlit Components, beautifying your apps, and quick deployment.
By the end of this book, you'll be able to create dynamic web apps in Streamlit quickly and effortlessly.
Page count: 305
Year of publication: 2023
Streamlit for Data Science
Second Edition
Create interactive data apps in Python
Tyler Richards
BIRMINGHAM—MUMBAI
Copyright © 2023 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Publishing Product Manager: Bhavesh Amin
Acquisition Editor, Peer Reviews: Gaurav Gavas
Project Editor: Amisha Vathare
Content Development Editor: Rebecca Robinson
Copy Editor: Safis Editing
Technical Editor: Aniket Shetty
Proofreader: Safis Editing
Indexer: Tejal Daruwale Soni
Presentation Designer: Ganesh Bhadwalkar
Developer Relations Marketing Executive: Monika Sangwan
First published: August 2021
Second edition: September 2023
Production reference: 1210923
Published by Packt Publishing Ltd.
Grosvenor House
11 St Paul’s Square
Birmingham
B3 1RB, UK.
ISBN 978-1-80324-822-6
www.packt.com
I remember a CS professor of mine pointing out that most of the magic in Harry Potter can now be done on computers! Images dance on our digital newspapers. Cellphones swirl with memories like portable Pensieves. Computer classes are our Charms. Algorithms are our Arithmancy!
If computing departments are the new Hogwarts, then technical tomes are the new spell books. The best works brim with technical secrets and arcana and represent a totem to some branch of our magical field: Python. Algorithms. Visualization. Machine learning.
I’m therefore particularly excited and proud to share that the canonical Streamlit book, Streamlit for Data Science, has a major new version, lovingly written by one of our own, a previous Streamlit Creator and now a Streamlit data scientist, Tyler Richards.
This is a true spell book. Yes, other books teach Streamlit, but this is the first that captures the essence of Streamlit. This book demonstrates how Streamlit is transforming the very definition of data science and machine learning.
Throughout the 2010s, data science and machine learning had two basic outputs. On the one hand, you could use a notebook environment to create static analyses. On the other, you could deploy complete machine learning models into production. Streamlit opened up a new middle way between these two: interactive apps that let you play with analyses and share models interactively throughout an organization.
Streamlit for Data Science teaches you how to master this new superpower. You start by creating a basic analysis and work your way up to complete Streamlit apps with fancy graphics and interactive machine learning models. You even learn how to use LLMs like OpenAI’s GPT series!
So read on! Learn the deep secrets of Streamlit. Join our magical community. Share your apps with the world. Contribute to our gallery. Or invent your own spells with custom Components. Whether you’re a wizard-in-training looking to deploy your first machine learning project or an experienced auror, this book will turn you into a Streamlit sorcerer.
Adrien Treuille
Streamlit Co-Founder
Tyler Richards is a data scientist at Snowflake, working on Streamlit-related projects. He joined Snowflake through the Streamlit acquisition in the spring of 2022. Before Snowflake, his focus was on integrity measurement at Facebook (Meta), along with helping bolster the state of US elections for the nonprofit Protect Democracy. He is a data scientist and industrial engineer by training and spends his free time applying data science in fun ways, such as applying machine learning to local campus elections, creating algorithms to help P&G target Tide Pod users, and finding ways to determine the best ping pong players in friend groups. You can find out more at https://www.tylerjrichards.com/.
Chanin Nantasenamat, Ph.D. is a developer advocate, YouTuber, and ex-professor with a passion for data science, bioinformatics, and content creation. After earning a B.Sc. (biomedical science) and Ph.D. (medical technology) from Mahidol University, his academic career started in 2006, and he was appointed a full professor of bioinformatics in 2018. He pioneered the use of data science and bioinformatics at Mahidol University through courses, research, mentorship, and as founding head of the Center of Data Mining and Biomedical Informatics (2013-2021). He has published more than 170 peer-reviewed research articles in the fields of biology, chemistry, and informatics. In 2021, he pivoted to tech and joined Streamlit, later acquired by Snowflake, where he works as a senior developer advocate. In his free time, he creates educational videos about data science and bioinformatics on YouTube as the Data Professor, with his channel having over 162,000 subscribers.
To join the Discord community for this book – where you can share feedback, ask the author questions, and learn about new releases – follow the link below:
https://packt.link/sl
Preface
Who this book is for
What this book covers
Acknowledgment
To get the most out of this book
Get in touch
An Introduction to Streamlit
Technical requirements
Why Streamlit?
Installing Streamlit
Organizing Streamlit apps
Streamlit plotting demo
Making an app from scratch
Using user input in Streamlit apps
Finishing touches – adding text to Streamlit
Summary
Uploading, Downloading, and Manipulating Data
Technical requirements
The setup – Palmer’s Penguins
Exploring Palmer’s Penguins
Flow control in Streamlit
Debugging Streamlit apps
Developing in Streamlit
Exploring in Jupyter and then copying to Streamlit
Data manipulation in Streamlit
An introduction to caching
Persistence with Session State
Summary
Data Visualization
Technical requirements
San Francisco Trees – a new dataset
Streamlit visualization use cases
Streamlit’s built-in graphing functions
Streamlit’s built-in visualization options
Plotly
Matplotlib and Seaborn
Bokeh
Altair
PyDeck
Configuration options
Summary
Machine Learning and AI with Streamlit
Technical requirements
The standard ML workflow
Predicting penguin species
Utilizing a pre-trained ML model in Streamlit
Training models inside Streamlit apps
Understanding ML results
Integrating external ML libraries – a Hugging Face example
Integrating external AI libraries – an OpenAI example
Authenticating with OpenAI
OpenAI API cost
Streamlit and OpenAI
Summary
Deploying Streamlit with Streamlit Community Cloud
Technical requirements
Getting started with Streamlit Community Cloud
A quick primer on GitHub
Deploying with Streamlit Community Cloud
Debugging Streamlit Community Cloud
Streamlit Secrets
Summary
Beautifying Streamlit Apps
Technical requirements
Setting up the SF Trees dataset
Working with columns in Streamlit
Exploring page configuration
Using Streamlit tabs
Using the Streamlit sidebar
Picking colors with a color picker
Multi-page apps
Editable DataFrames
Summary
Exploring Streamlit Components
Technical requirements
Adding editable DataFrames with streamlit-aggrid
Creating drill-down graphs with streamlit-plotly-events
Using Streamlit Components – streamlit-lottie
Using Streamlit Components – streamlit-pandas-profiling
Interactive maps with st-folium
Helpful mini-functions with streamlit-extras
Finding more Components
Summary
Deploying Streamlit Apps with Hugging Face and Heroku
Technical requirements
Choosing between Streamlit Community Cloud, Hugging Face, and Heroku
Deploying Streamlit with Hugging Face
Deploying Streamlit with Heroku
Setting up and logging in to Heroku
Cloning and configuring our local repository
Deploying to Heroku
Summary
Connecting to Databases
Technical requirements
Connecting to Snowflake with Streamlit
Connecting to BigQuery with Streamlit
Adding user input to queries
Organizing queries
Summary
Improving Job Applications with Streamlit
Technical requirements
Using Streamlit for proof-of-skill data projects
Machine learning – the Penguins app
Visualization – the Pretty Trees app
Improving job applications in Streamlit
Questions
Answering Question 1
Answering Question 2
Summary
The Data Project – Prototyping Projects in Streamlit
Technical requirements
Data science ideation
Collecting and cleaning data
Making an MVP
How many books do I read each year?
How long does it take for me to finish a book that I have started?
How long are the books that I have read?
How old are the books that I have read?
How do I rate books compared to other Goodreads users?
Iterative improvement
Beautification via animation
Organization using columns and width
Narrative building through text and additional statistics
Hosting and promotion
Summary
Streamlit Power Users
Fanilo Andrianasolo
Adrien Treuille
Gerard Bentley
Arnaud Miribel and Zachary Blackwood
Yuichiro Tachibana
Summary
Other Books You May Enjoy
Index
Thanks for purchasing this book!
Do you like to read on the go but are unable to carry your print books everywhere?
Is your eBook purchase not compatible with the device of your choice?
Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.
Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books directly into your application.
The perks don’t stop there: you can get exclusive access to discounts, newsletters, and great free content in your inbox daily.
Follow these simple steps to get the benefits:
Scan the QR code or visit the link below: https://packt.link/free-ebook/9781803248226
Submit your proof of purchase.
That’s it! We’ll send your free PDF and other benefits to your email directly.