Learn to use Scala to build a recommendation engine from scratch and empower your website users
This book is written for those who want to learn the different tools in the Scala ecosystem to build a recommendation engine. No prior knowledge of Scala or recommendation engines is assumed.
With an increase of data in online e-commerce systems, the challenges in assisting users with narrowing down their search have grown dramatically. The various tools available in the Scala ecosystem enable developers to build a processing pipeline to meet those challenges and create a recommendation system to accelerate business growth and leverage brand advocacy for your clients.
This book provides you with the Scala knowledge you need to build a recommendation engine.
You'll be introduced to Scala and related tools to set the stage for the project, and you'll familiarize yourself with the stages of the data processing pipeline, including the stages at which you can leverage the power of Scala and related tools. You'll also discover different machine learning algorithms using MLlib.
As the book progresses, you will gain detailed knowledge of what constitutes a collaborative filtering-based recommendation and explore different methods to improve the recommendations that users receive.
A step-by-step guide full of real-world, hands-on examples of Scala recommendation engines. Each example is placed in context with explanation and visuals.
Page count: 156
Year of publication: 2016
Copyright © 2016 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: January 2016
Production reference: 1281215
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78528-258-4
www.packtpub.com
Author
Saleem Ansari
Reviewers
Eric Le Goff
Andrii Kravets
Loránd Szakács
Commissioning Editor
Nadeem Bagban
Acquisition Editor
Vinay Argekar
Content Development Editor
Zeeyan Pinheiro
Technical Editor
Siddhi Rane
Copy Editor
Ting Baker
Project Coordinator
Suzanne Coutinho
Proofreader
Safis Editing
Indexer
Rekha Nair
Graphics
Kirk D'Penha
Production Coordinator
Manu Joseph
Cover Work
Manu Joseph
Saleem Ansari is a full-stack developer with over 8 years of industry experience. He has a special interest in machine learning and information retrieval. Having implemented data ingestion and a processing pipeline in Core Java and Ruby separately, he knows the challenges that huge data sets pose in such systems. He has worked for companies such as Red Hat, Impetus Technologies, Belzabar Software, and Exzeo Software. He is also a passionate member of the free and open source software (FOSS) community. He started his journey with FOSS in the year 2004. The very next year, he formed JMILUG, the Linux Users Group at Jamia Millia Islamia University, New Delhi. Since then, he has been contributing to FOSS by organizing community activities and contributing code to various projects (for more information, visit http://github.com/tuxdna). He also mentors students on FOSS and its benefits.
In 2015, he reviewed two books related to Apache Mahout, namely Learning Apache Mahout and Apache Mahout Essentials; both books were produced by Packt Publishing.
He blogs at http://tuxdna.in/ and can be reached at <[email protected]> via e-mail.
I dedicate this book to my parents.
I would like to acknowledge the amazing people who have helped me push forward while writing this book. First off, I would like to thank Vinay Argekar and Zeeyan Pinheiro from Packt Publishing, who have been of immense help and guidance right from the beginning of this book. I would like to especially thank the reviewers, Eric Le Goff and Andrii Kravets. I wouldn't have leveled up the content if I had not received their critical reviews and suggestions. So much kudos to you guys! I would like to give another special mention to Pat Ferrel from the Apache Mahout and PredictionIO project. He helped me understand the unified recommender algorithm that is mentioned in the book.
All appreciation is due to my family and friends, who were supportive while I was writing this book.
Eric Le Goff is a senior developer and an open source evangelist. Located in Bordeaux, France, he has more than 15 years of experience in large-scale system designs and server-side developments in both start-ups and established corporations, from digital signature solutions to financial institutions and risk management.
A former board member at the OW2 consortium (an international open source community for infrastructure), he is also a Scala enthusiast with Coursera certifications such as Functional Programming Principles in Scala and Principles of Reactive Programming.
He is also passionate about NoSQL solutions (M101J: MongoDB for Java developers certified).
He has reviewed the book Scala for Java Developers, Packt Publishing.
First, thanks go to my wife, Corine, who constantly supports everything that I undertake. I also would like to include all the contributors and the open source community at large. Finally, I'd like to thank Martin Odersky and his team for creating Scala.
Andrii Kravets is a highly motivated, agile-minded engineer with more than 5 years of experience in software development and software project management who wants to make the world better. He has a lot of experience with high-load distributed projects, big data, JVM languages, machine learning, and building complex web solutions.
He is currently making the world better at TransferWise.
For support files and downloads related to your book, please visit www.PacktPub.com.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www2.packtpub.com/books/subscription/packtlib
Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.
If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view 9 entirely free books. Simply use your login credentials for immediate access.
With the growth of the Internet and the widespread adoption of e-commerce and social media, a lot of new services have arrived in recent years. We shop online, we communicate online, we stay up-to-date online, and so on. The volume of data has grown enormously, and this has made it increasingly tough for service providers to provide only the relevant data. Recommendation engines help us provide only the relevant data to a consumer.
In this book, we will use the Scala programming language and the many tools that are available in its ecosystem, such as Apache Spark, Play Framework, Spray, Kafka, and PredictionIO, to build a recommendation engine. We will reach that stage step by step with a real-world dataset and a fully functional application that gives readers hands-on experience. We discuss the key topics in detail so that readers can get started on their own. You will learn the challenges and approaches used to build a recommendation engine.
You must have some understanding of the Scala programming language, SBT, and command-line tools. An understanding of different machine learning and data processing concepts is beneficial but not required. You will learn the tools necessary for writing data-munging programs and experimenting using Scala.
Chapter 1, Introduction to Scala and Machine Learning, is a fast-paced introduction to Scala, SBT, Spark, MLlib, and other related tools. It sets the stage for the upcoming experiments.
Chapter 2, Data Processing Pipeline Using Scala, explores ways to compose a data processing pipeline using Scala. We do this by taking a sample dataset from the recommendation system and then building the pipeline.
Chapter 3, Conceptualizing an E-Commerce Store, discusses the need for a recommendation engine. We discuss different ways in which we can present recommendations; we will also explore the architecture of our project.
Chapter 4, Machine Learning Algorithms, discusses some machine learning algorithms that are relevant while building different aspects of a recommender system. We will also have hands-on exercises dealing with Apache Spark's MLlib library.
Chapter 5, Recommendation Engines and Where They Fit in?, implements our first recommender system on a dataset for products. We will continue by populating the dataset, creating a web application, and adding recommendation pages and product/customer trends.
Chapter 6, Collaborative Filtering versus Content-Based Recommendation Engines, focuses on tuning the recommendations that are user-specific, rather than being global in nature. We will implement the content-based recommendation and collaborative filtering-based recommendations. Then, we will compare these two approaches.
Chapter 7, Enhancing the User Experience, discusses some tricks that add more spice to the overall user experience. We will add product search and recommendations listing and also discuss recommendation behavior.
Chapter 8, Learning from User Feedback, discusses a case study of PredictionIO. We will have a look at a hybrid recommender, called the unified recommender, that is implemented using PredictionIO.
Before you start reading this book, ensure that you have all the necessary software installed. The prerequisites for this book are as follows:
This book is intended for those developers who are keen on understanding how a recommender system is built from scratch. It is assumed that you have a basic understanding of the Scala programming language and you can also handle regular data-munging tasks.
Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or may have disliked. Reader feedback is important for us to develop titles that you really get the most out of.
To send us general feedback, simply send an e-mail to <[email protected]>, and mention the book title via the subject of your message.
