Building Recommendation Engines - Suresh Kumar Gorakala - E-Book

Building Recommendation Engines E-Book

Suresh Kumar Gorakala

Description

Understand your data and user preferences to make intelligent, accurate, and profitable decisions

About This Book

  • A step-by-step guide to building recommendation engines that are personalized, scalable, and real time
  • Get to grips with the best tool available on the market to create recommender systems
  • This hands-on guide shows you how to implement different tools for recommendation engines, and when to use which

Who This Book Is For

This book caters to beginners and experienced data scientists looking to understand and build complex predictive decision-making systems and recommendation engines using R, Python, Spark, Neo4j, and Hadoop.

What You Will Learn

  • Build your first recommendation engine
  • Discover the tools needed to build recommendation engines
  • Dive into the various techniques of recommender systems such as collaborative, content-based, and cross-recommendations
  • Create efficient decision-making systems that will ease your work
  • Familiarize yourself with machine learning algorithms in different frameworks
  • Master different versions of recommendation engines from practical code examples
  • Explore various recommender systems and implement them with popular tools such as R, Python, Spark, and others

In Detail

A recommendation engine (sometimes referred to as a recommender system) is a tool that lets algorithm developers predict what a user may or may not like among a list of given items. Recommender systems have become extremely common in recent years, and are applied in a variety of applications. The most popular application areas are movies, music, news, books, research articles, search queries, social tags, and products in general.

The book starts with an introduction to recommendation systems and their applications. You will then start building recommendation engines straight away from the very basics. As you move along, you will learn to build recommender systems with popular tools such as R, Python, Spark, Neo4j, and Hadoop. You will gain insight into the pros and cons of each recommendation engine and learn when to use which, so that each pick is the one that suits you best.

During the course of the book, you will create a simple recommendation engine, a real-time recommendation engine, a scalable recommendation engine, and more. You will familiarize yourself with various techniques of recommender systems, such as collaborative, content-based, and cross-recommendations, before getting to know the best practices of building a recommender system towards the end of the book.

Style and approach

This book follows a step-by-step practical approach where users will learn to build recommendation engines with increasing complexity in every chapter.

Page count: 297

Year of publication: 2016


Table of Contents

Building Recommendation Engines
Credits
About the Author
About the Reviewers
www.PacktPub.com
Why subscribe?
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. Introduction to Recommendation Engines
Recommendation engine definition
Need for recommender systems
Big data driving the recommender systems
Types of recommender systems
Collaborative filtering recommender systems
Content-based recommender systems
Hybrid recommender systems
Context-aware recommender systems
Evolution of recommender systems with technology
Mahout for scalable recommender systems
Apache Spark for scalable real-time recommender systems
Neo4j for real-time graph-based recommender systems
Summary
2. Build Your First Recommendation Engine
Building our basic recommendation engine
Loading and formatting data
Calculating similarity between users
Predicting the unknown ratings for users
Summary
3. Recommendation Engines Explained
Evolution of recommendation engines
Nearest neighborhood-based recommendation engines
User-based collaborative filtering
Item-based collaborative filtering
Advantages
Disadvantages
Content-based recommender systems
User profile generation
Advantages
Disadvantages
Context-aware recommender systems
Context definition
Pre-filtering approaches
Post-filtering approaches
Advantages
Disadvantages
Hybrid recommender systems
Weighted method
Mixed method
Cascade method
Feature combination method
Advantages
Model-based recommender systems
Probabilistic approaches
Machine learning approaches
Mathematical approaches
Advantages
Summary
4. Data Mining Techniques Used in Recommendation Engines
Neighbourhood-based techniques
Euclidean distance
Cosine similarity
Jaccard similarity
Pearson correlation coefficient
Mathematic model techniques
Matrix factorization
Alternating least squares
Singular value decomposition
Machine learning techniques
Linear regression
Classification models
Linear classification
KNN classification
Support vector machines
Decision trees
Ensemble methods
Random forests
Bagging
Boosting
Clustering techniques
K-means clustering
Dimensionality reduction
Principal component analysis
Vector space models
Term frequency
Term frequency inverse document frequency
Evaluation techniques
Cross-validation
Regularization
Root-mean-square error (RMSE)
Mean absolute error (MAE)
Precision and recall
Summary
5. Building Collaborative Filtering Recommendation Engines
Installing the recommenderlab package in RStudio
Datasets available in the recommenderlab package
Exploring the Jester5K dataset
Description
Usage
Format
Details
Exploring the dataset
Exploring the rating values
Building user-based collaborative filtering with recommenderlab
Preparing training and test data
Creating a user-based collaborative model
Predictions on the test set
Analyzing the dataset
Evaluating the recommendation model using the k-cross validation
Evaluating user-based collaborative filtering
Building an item-based recommender model
Building an IBCF recommender model
Model evaluation
Model accuracy using metrics
Model accuracy using plots
Parameter tuning for IBCF
Collaborative filtering using Python
Installing the required packages
Data source
Data exploration
Rating matrix representation
Creating training and test sets
The steps for building a UBCF
User-based similarity calculation
Predicting the unknown ratings for an active user
User-based collaborative filtering with the k-nearest neighbors
Finding the top-N nearest neighbors
Item-based recommendations
Evaluating the model
The training model for k-nearest neighbors
Evaluating the model
Summary
6. Building Personalized Recommendation Engines
Personalized recommender systems
Content-based recommender systems
Building a content-based recommendation system
Content-based recommendation using R
Dataset description
Content-based recommendation using Python
Dataset description
User activity
Item profile generation
User profile creation
Context-aware recommender systems
Building a context-aware recommender systems
Context-aware recommendations using R
Defining the context
Creating context profile
Generating context-aware recommendations
Summary
7. Building Real-Time Recommendation Engines with Spark
About Spark 2.0
Spark architecture
Spark components
Spark Core
Structured data with Spark SQL
Streaming analytics with Spark Streaming
Machine learning with MLlib
Graph computation with GraphX
Benefits of Spark
Setting up Spark
About SparkSession
Resilient Distributed Datasets (RDD)
About ML Pipelines
Collaborative filtering using Alternating Least Square
Model based recommender system using pyspark
MLlib recommendation engine module
The recommendation engine approach
Implementation
Data loading
Data exploration
Building the basic recommendation engine
Making predictions
User-based collaborative filtering
Model evaluation
Model selection and hyperparameter tuning
Cross-Validation
CrossValidator
Train-Validation Split
Setting the ParamMaps/parameters
Setting the evaluator object
Summary
8. Building Real-Time Recommendations with Neo4j
Discerning different graph databases
Labeled property graph
Understanding GraphDB core concepts
Neo4j
Cypher query language
Cypher query basics
Node syntax
Relationship syntax
Building your first graph
Creating nodes
Creating relationships
Setting properties to relations
Loading data from csv
Neo4j Windows installation
Installing Neo4j on the Linux platform
Downloading Neo4j
Setting up Neo4j
Starting Neo4j from the command line
Building recommendation engines
Loading data into Neo4j
Generating recommendations using Neo4j
Collaborative filtering using the Euclidean distance
Collaborative filtering using Cosine similarity
Summary
9. Building Scalable Recommendation Engines with Mahout
Mahout - a general introduction
Setting up Mahout
The standalone mode - using Mahout as a library
Setting Mahout for the distributed mode
Core building blocks of Mahout
Components of a user-based collaborative recommendation engine
Building recommendation engines using Mahout
Dataset description
User-based collaborative filtering
Item-based collaborative filtering
Evaluating collaborative filtering
Evaluating user-based recommenders
Evaluating item-based recommenders
SVD recommenders
Distributed recommendations using Mahout
ALS recommendation on Hadoop
The architecture for a scalable system
Summary
10. What Next - The Future of Recommendation Engines
Future of recommendation engines
Phases of recommendation engines
Phase 1 - general recommendation engines
Phase 2 - personalized recommender systems
Phase 3 - futuristic recommender systems
End of search
Leaving the Web behind
Emerging from the Web
Next best actions
Use cases to look out for
Smart homes
Healthcare recommender systems
News as recommendations
Popular methodologies
Serendipity
Temporal aspects of recommendation engines
A/B testing
Feedback mechanism
Summary

Building Recommendation Engines

Copyright © 2016 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: December 2016

Production reference: 1231216

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham 

B3 2PB, UK.

ISBN 978-1-78588-485-6

www.packtpub.com

Credits

  • Author: Suresh Kumar Gorakala
  • Copy Editor: Manisha Sinha
  • Reviewers: Vikram Dhillon, Vimal Romeo
  • Project Coordinator: Nidhi Joshi
  • Commissioning Editor: Veena Pagare
  • Proofreader: Safis Editing
  • Acquisition Editor: Tushar Gupta
  • Indexer: Mariammal Chettiyar
  • Content Development Editor: Manthan Raja
  • Graphics: Disha Haria
  • Technical Editor: Dinesh Chaudhary
  • Production Coordinator: Arvindkumar Gupta

About the Author

Suresh Kumar Gorakala is a data scientist focused on artificial intelligence. He has close to 10 years of professional experience, having worked with various global clients across multiple domains and helped them solve their business problems using advanced big data analytics. He has worked extensively on recommendation engines, natural language processing, advanced machine learning, and graph databases. He previously co-authored Building a Recommendation System with R for Packt Publishing. He is a passionate traveler and a photographer by hobby.

I would like to thank my wife for putting up with my late-night writing sessions and all my family members for supporting me over the months. I also give deep thanks and gratitude to Barathi Ganesh, Raj Deepthi, Harsh, and my colleagues, without whose support this book quite possibly would not have happened. I would also like to thank all the mentors that I’ve had over the years. Without learning from these teachers, there is not a chance I could be doing what I do today, and it is because of them and others that I may not have listed here that I feel compelled to pass my knowledge on to those willing to learn. I would also like to thank all the reviewers and project managers of the book for making it a reality.

About the Reviewers

Vikram Dhillon is a software developer, a bioinformatics researcher, and a software coach at the Blackstone LaunchPad at the University of Central Florida. He has been working on his own startup involving healthcare data security of late. He lives in Orlando and regularly attends development meetups and hackathons. He enjoys spending his spare time reading about new technologies, such as the blockchain, and developing tutorials for machine learning in game design. He has been involved in open-source projects for over five years and writes about technology and startups at opsbug.com.

Vimal Romeo is a data scientist at Ernst and Young, Rome. He holds a master’s degree in Big Data Analytics from Luiss Business School, Rome. He also holds an MBA degree from XIME, India, and a bachelor’s degree in computer science and engineering from CUSAT, India. He is an author at MilanoR, a blog related to the R language.

I would like to thank my mom, Mrs Bernadit, and my brother, Vibin, for their continuous support. I would also like to thank my friends Matteo Amadei, Antonella Di Luca, Asish Mathew, and Eleonora Polidoro, who supported me during this process. A special thanks to Nidhi Joshi from Packt Publishing for keeping me motivated during the process.

www.PacktPub.com

For support files and downloads related to your book, please visit www.PacktPub.com.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www.packtpub.com/mapt

Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.

Why subscribe?

  • Fully searchable across every book published by Packt
  • Copy and paste, print, and bookmark content
  • On demand and accessible via a web browser

Love You Mom

Preface

Building Recommendation Engines is a comprehensive guide to implementing recommendation engines, such as collaborative filtering, content-based, and context-aware recommendation engines, using R, Python, Spark, Mahout, and Neo4j. The book covers various recommendation engines widely used across industries, along with their implementations. It also includes a chapter on popular data mining techniques commonly used in building recommendations, and briefly discusses the future of recommendation engines at the end of the book.

What this book covers

Chapter 1, Introduction to Recommendation Engines, is a refresher for data scientists and an introduction for beginners to recommendation engines. This chapter introduces popular recommendation engines that people use in their day-to-day lives. It also covers the popular recommendation engine approaches, along with their pros and cons.

Chapter 2, Build Your First Recommendation Engine, is a short chapter about how to build a movie recommendation engine, giving us a head start before we take off into the world of recommendation engines.

Chapter 3, Recommendation Engines Explained, is about different recommendation engine techniques popularly employed, such as user-based collaborative filtering recommendation engines, item-based collaborative filtering, content-based recommendation engines, context-aware recommenders, hybrid recommenders, model-based recommender systems using Machine Learning models and mathematical models.

Chapter 4, Data Mining Techniques Used in Recommendation Engines, is about various Machine Learning techniques used in building recommendation engines such as similarity measures, classification, regression, and dimension reduction techniques. This chapter also covers evaluation metrics to test the recommendation engine’s predictive power.

Chapter 5, Building Collaborative Filtering Recommendation Engines, is about how to build user-based collaborative filtering and item-based collaborative filtering in R and Python. We'll also learn about different libraries available in R and Python that are extensively used in building recommendation engines.

Chapter 6, Building Personalized Recommendation Engines, is about how to build personalized recommendation engines using R and Python and the various libraries used for building content-based recommender systems and context-aware recommendation engines.

Chapter 7, Building Real-Time Recommendation Engines with Spark, is about the basics of Spark and MLlib required for building real-time recommender systems.

Chapter 8, Building Real-Time Recommendation Engines with Neo4j, is about the basics of graphDB and Neo4j concepts and how to build real-time recommender systems using Neo4j.

Chapter 9, Building Scalable Recommendation Engines with Mahout, is about the basic building blocks of Hadoop and Mahout required for building scalable recommender systems. It also covers the architecture we use to build scalable systems and a step-by-step implementation using Mahout and SVD.

Chapter 10, What Next?, is the final chapter. It summarizes what we have learned so far, the best practices employed in building decision-making systems, and where the future of recommender systems is set to move.

What you need for this book

To get started with the different implementations of recommendation engines in R, Python, Spark, Neo4j, and Mahout, we need the following software:

Chapter number | Software required (with version) | Download links to the software | OS required
2, 4, 5 | RStudio version 0.99.489 | https://www.rstudio.com/products/rstudio/download/ | Windows 7+/CentOS 6
2, 4, 5 | R version 3.2.2 | https://cran.r-project.org/bin/windows/base/ | Windows 7+/CentOS 6
5, 6, 7 | Anaconda 4.2 for Python 3.5 | https://www.continuum.io/downloads | Windows 7+/CentOS 6
8 | Neo4j 3.0.6 | https://neo4j.com/download/ | Windows 7+/CentOS 6
7 | Spark 2.0 | https://spark.apache.org/downloads.html | Windows 7+/CentOS 6
9 | Hadoop 2.5, Mahout 0.12 | http://hadoop.apache.org/releases.html and http://mahout.apache.org/general/downloads.html | Windows 7+/CentOS 6
7, 8, 9 | Java 7/Java 8 | http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html | Windows 7+/CentOS 6

Who this book is for

This book caters to beginners and experienced data scientists looking to understand and build complex predictive decision-making systems and recommendation engines using R, Python, Spark, Neo4j, and Hadoop.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book: what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail [email protected], and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

You can download the code files by following these steps:

  1. Log in or register to our website using your e-mail address and password.
  2. Hover the mouse pointer on the SUPPORT tab at the top.
  3. Click on Code Downloads & Errata.
  4. Enter the name of the book in the Search box.
  5. Select the book for which you're looking to download the code files.
  6. Choose from the drop-down menu where you purchased this book from.
  7. Click on Code Download.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

  • WinRAR / 7-Zip for Windows
  • Zipeg / iZip / UnRarX for Mac
  • 7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/building-recommendation-engines. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Downloading the color images of this book

We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from http://www.packtpub.com/sites/default/files/downloads/BuildingRecommendationEngines_ColorImages.pdf.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books (maybe a mistake in the text or the code), we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at [email protected] with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at [email protected], and we will do our best to address the problem.

Need for recommender systems

Given the complexity and challenges in building recommendation engines, a considerable amount of thought, skill, investment, and technology goes into building recommender systems. Are they worth such an investment? Let us look at some facts:

  • Two-thirds of movies watched by Netflix customers are recommended movies
  • 38% of click-throughs on Google News are on recommended links
  • 35% of sales at Amazon arise from recommended products
  • ChoiceStream claims that 28% of people would like to buy more music if they could find what they like

Big data driving the recommender systems

Recommender systems now affect our lives in many ways. One obvious example is how they have redefined the online shopping experience. As we browse e-commerce sites and purchase products, the underlying recommendation engines respond immediately, in real time, with relevant suggestions. From the perspective of both businesses and consumers, recommendation engines have been immensely beneficial. Without a doubt, big data is the driving force behind recommender systems. A good recommendation engine should be reliable, scalable, and highly available, and should be able to provide personalized recommendations in real time to the large user base it serves.

A typical recommendation system cannot do its job efficiently without sufficient data. The introduction of big data technology enabled companies to capture plenty of user data, such as past purchases, browsing history, and feedback information, and feed it to the recommendation engines to generate relevant and effective recommendations in real time. In short, even the most advanced recommender system cannot be effective without the supply of big data. The role of big data and improvements in technology, both on the software and hardware front, goes beyond just supplying massive data. It also provides meaningful, actionable data fast, and provides the necessary setup to quickly process the data in real time.

Source: http://www.kdnuggets.com/2015/10/big-data-recommendation-systems-change-lives.html.

Types of recommender systems

Now that we have defined recommender systems, their objective, usefulness, and the driving force behind recommender systems, in this section, we introduce different types of popular recommender systems in use.

Collaborative filtering recommender systems

Collaborative filtering recommender systems are the most basic form of recommendation engine. In this type of recommendation engine, items are filtered from a large set of alternatives collaboratively, using the preferences of many users.

The basic assumption in a collaborative filtering recommender system is that if two users shared the same interests in the past, they will also have similar tastes in the future. If, for example, user A and user B have similar movie preferences, and user A recently watched Titanic, which user B has not yet seen, then the idea is to recommend this unseen movie to user B. The movie recommendations on Netflix are one good example of this type of recommender system.

There are two types of collaborative filtering recommender systems:

  • User-based collaborative filtering: In user-based collaborative filtering, recommendations are generated by considering the preferences in the user's neighborhood. User-based collaborative filtering is done in two steps:
      1. Identify similar users based on similar user preferences.
      2. Recommend new items to an active user based on the ratings given by similar users to the items not rated by the active user.
  • Item-based collaborative filtering: In item-based collaborative filtering, the recommendations are generated using the neighborhood of items. Unlike user-based collaborative filtering, we first find similarities between items and then recommend non-rated items that are similar to the items the active user has rated in the past. Item-based recommender systems are constructed in two steps:
      1. Calculate the item similarity based on the item preferences.
      2. Find the non-rated items most similar to the items the active user has rated, and recommend them.
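
The two steps of user-based collaborative filtering described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation used later in the book: the toy ratings, the choice of cosine similarity over co-rated items, and the function names cosine_similarity, predict_rating, and recommend are all hypothetical.

```python
from math import sqrt

# Hypothetical toy ratings (user -> {item: rating}); not data from the book.
ratings = {
    "A": {"Titanic": 5, "Avatar": 4, "Up": 1},
    "B": {"Avatar": 5, "Up": 1},
    "C": {"Titanic": 1, "Avatar": 2, "Up": 5},
}

def cosine_similarity(u, v):
    """Step 1: cosine similarity computed over the items both users rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(u[i] ** 2 for i in common))
    norm_v = sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

def predict_rating(ratings, active, item):
    """Step 2: similarity-weighted average of other users' ratings for item."""
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == active or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[active], other_ratings)
        num += sim * other_ratings[item]
        den += abs(sim)
    return num / den if den else None

def recommend(ratings, active, top_n=3):
    """Rank the items the active user has not rated by predicted rating."""
    seen = set(ratings[active])
    candidates = {i for r in ratings.values() for i in r} - seen
    scored = [(i, predict_rating(ratings, active, i)) for i in candidates]
    scored = [(i, s) for i, s in scored if s is not None]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]

print(recommend(ratings, "B"))
```

In this toy data, user B's ratings resemble user A's far more than user C's, so A's high rating of Titanic dominates the prediction for B.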

We will learn in depth about these two forms of recommendations in Chapter 3, Recommendation Engines Explained.

While building collaborative filtering recommender systems, we will learn about the following aspects:

  • How do we calculate the similarity between users?
  • How do we calculate the similarity between items?
  • How are recommendations generated?
  • How do we deal with new items and new users whose data is not known?
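
As one possible answer to the item-similarity question, the sketch below inverts the ratings so that each item is described by the users who rated it, then scores an unrated item by its similarity to the items the active user has already rated. The data, the cosine measure, and the helper names item_vectors and recommend are assumptions made for illustration only.

```python
from math import sqrt

# Hypothetical toy ratings (user -> {item: rating}); not data from the book.
ratings = {
    "u1": {"Titanic": 5, "Notebook": 5, "Alien": 1},
    "u2": {"Titanic": 4, "Notebook": 5, "Alien": 2},
    "u3": {"Titanic": 1, "Notebook": 2, "Alien": 5},
    "u4": {"Titanic": 5, "Alien": 1},
}

def item_vectors(ratings):
    """Invert user -> item ratings into item -> {user: rating}."""
    items = {}
    for user, user_ratings in ratings.items():
        for item, value in user_ratings.items():
            items.setdefault(item, {})[user] = value
    return items

def cosine_similarity(a, b):
    """Cosine similarity over the users who rated both items."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    norm_a = sqrt(sum(a[u] ** 2 for u in common))
    norm_b = sqrt(sum(b[u] ** 2 for u in common))
    return dot / (norm_a * norm_b)

def recommend(ratings, active, top_n=3):
    """Score each unrated item by a similarity-weighted average of the
    active user's own ratings for the items it resembles."""
    items = item_vectors(ratings)
    rated = ratings[active]
    scores = {}
    for candidate in items:
        if candidate in rated:
            continue
        num = den = 0.0
        for item, value in rated.items():
            sim = cosine_similarity(items[candidate], items[item])
            num += sim * value
            den += sim
        if den:
            scores[candidate] = num / den
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)[:top_n]

print(recommend(ratings, "u4"))
```

Because item similarities depend only on the rating matrix and not on the active user, they can be computed once and reused for every user, which is one reason item-based approaches are often chosen for larger catalogs.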

The advantage of collaborative filtering systems is that they are simple to implement and very accurate. However, they have their own limitations, such as the cold start problem: collaborative filtering systems fail to make recommendations for first-time users whose information is not yet available in the system.

Content-based recommender systems

In collaborative filtering, we consider only user-item preferences when building the recommender system. Though this approach is accurate, it makes more sense to also consider user properties and item properties while building recommendation engines. Unlike in collaborative filtering, content-based recommendation engines use item properties and the users' preferences for those properties.

As the name indicates, a content-based recommender system uses the content information of the items to build the recommendation model. A content-based recommender system typically contains a user-profile-generation step, an item-profile-generation step, and a model-building step to generate recommendations for an active user. The system recommends items to users by matching the content or features of items against user profiles. As an example, if you have searched for videos of Lionel Messi on YouTube, then the content-based recommender system will learn your preference and recommend other videos related to Lionel Messi and other videos related to football.

In simpler terms, the system recommends items similar to those that the user has liked in the past. The similarity of items is calculated from the features of the items being compared and is matched against the user's historical preferences.

While building a content-based recommendation system, we take into consideration the following questions:

  • How do we choose content or features of the products?
  • How do we create user profiles with preferences similar to that of the product content?
  • How do we create similarity between items based on their features?
  • How do we create and update user profiles continuously?

The preceding considerations will be explained in Chapter 3, Recommendation Engines Explained, where we will also learn about this system in detail, along with its pros and cons. This technique doesn't take the user's neighborhood preferences into consideration. Hence, it doesn't require a large user group's preferences for items to achieve good recommendation accuracy. It only considers the user's past preferences and the properties/features of the items.
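
The item-profile, user-profile, and matching steps described in this section can be sketched as follows. This is a hedged illustration only: the video titles, the hand-assigned feature tags, and the functions build_user_profile, cosine_similarity, and recommend are hypothetical, and a real system would derive features from the content itself.

```python
from math import sqrt

# Hypothetical item profiles (item -> {feature: weight}); tags are made up.
item_profiles = {
    "Messi top goals": {"messi": 1, "football": 1},
    "Messi interview": {"messi": 1, "interview": 1},
    "Champions League recap": {"football": 1, "highlights": 1},
    "Cooking pasta": {"cooking": 1, "pasta": 1},
}

def build_user_profile(liked_items, item_profiles):
    """User profile: sum of the feature vectors of items the user liked."""
    profile = {}
    for item in liked_items:
        for feature, weight in item_profiles[item].items():
            profile[feature] = profile.get(feature, 0.0) + weight
    return profile

def cosine_similarity(a, b):
    """Cosine similarity between two sparse feature vectors."""
    dot = sum(a[f] * b[f] for f in set(a) & set(b))
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(liked_items, item_profiles, top_n=3):
    """Rank unseen items by similarity between user and item profiles."""
    profile = build_user_profile(liked_items, item_profiles)
    scores = {
        item: cosine_similarity(profile, features)
        for item, features in item_profiles.items()
        if item not in liked_items
    }
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)[:top_n]

print(recommend(["Messi top goals"], item_profiles))
```

Note that no other user's data is involved: the ranking depends only on this user's history and the item features, which matches the point above that a large user group is not required.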

Hybrid recommender systems

This type of recommendation engine is built by combining various recommender systems, so that the disadvantages of one system are offset by the advantages of another, resulting in a more robust system. For example, collaborative filtering methods fail when new items don't have ratings, whereas content-based systems have feature information about the items available. By combining the two, new items can be recommended more accurately and efficiently.

For example, if you are a frequent reader of news on Google News, the underlying recommendation engine recommends news articles to you by combining popular news articles read by people similar to you and using your personal preferences, calculated using your previous click information. With this type of recommendation system, collaborative filtering recommendations are combined with content-based recommendations before pushing recommendations.

Before building a hybrid model, we should consider the following questions:

  • What recommender techniques should be combined to achieve the business solution?
  • How should we combine various techniques and their results for better predictions?

The advantage of hybrid recommendation engines is that they increase the efficiency of recommendations compared to the individual recommendation techniques. This approach also offers users a good mix of recommendations, both at the personalized level and at the neighborhood level. In Chapter 3, Recommendation Engines Explained, we will learn more about hybrid recommendations.
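One simple way to combine the two techniques is a weighted blend of their scores. The sketch below is an assumption made for illustration, not the approach the book prescribes: the function names, the `cf_weight` parameter, and the fallback rule for unrated items are all hypothetical choices.

```python
def hybrid_scores(cf_scores, content_scores, cf_weight=0.7):
    """Blend collaborative-filtering and content-based scores per item.

    Items with no collaborative score (for example, new items without
    ratings) fall back to their content-based score alone.
    """
    blended = {}
    for item in set(cf_scores) | set(content_scores):
        content = content_scores.get(item, 0.0)
        if item in cf_scores:
            blended[item] = cf_weight * cf_scores[item] + (1 - cf_weight) * content
        else:
            # Cold-start item: only feature-based evidence is available.
            blended[item] = content
    return blended


def rank(scores, top_n=3):
    """Return item names sorted by descending score, ties broken by name."""
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


# Hypothetical scores in the range 0..1; "new" has no ratings yet.
cf = {"a": 0.9, "b": 0.2}
content = {"a": 0.1, "b": 0.8, "new": 0.7}
ranking = rank(hybrid_scores(cf, content))
```

Here the unrated item "new" can still be ranked, which a purely collaborative model could not do. Other combination strategies, such as switching between models or feeding one model's output into another, are equally valid design choices.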

Context-aware recommender systems

Personalized recommender systems, such as content-based recommender systems, have a limitation: they fail to suggest recommendations with respect to context. For example, assume a lady is very fond of ice-cream, and that she travels to a cold place. There is a high chance that a personalized recommender system will suggest a popular ice-cream brand. Now let us ask ourselves a question: is it the right thing to suggest an ice-cream to a person in a cold place? It would make more sense to suggest a coffee. A system that makes this type of recommendation, which is both personalized and context-aware, is called a context-aware recommender system. In the preceding example, place is the context.

User preferences may differ with the context, such as time of day, season, mood, place, options offered by the system, and so on. A person at a different location at a different time with different people may need different things. A context-aware recommender system takes the context into account before computing or serving recommendations. In this way, it caters to the different needs people have in different contexts.
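The ice-cream example can be sketched as a re-weighting of personalized scores by context. The preference scores, the suitability table, and the function name below are hypothetical, and multiplying the two scores is only one of several ways of bringing context into a recommender.

```python
# Hypothetical personalized preference scores for one user (0..1).
USER_PREFERENCES = {"ice cream": 0.9, "coffee": 0.6, "lemonade": 0.5}

# Hypothetical suitability of each item in each context (0..1).
CONTEXT_SUITABILITY = {
    "cold": {"ice cream": 0.1, "coffee": 1.0, "lemonade": 0.2},
    "hot": {"ice cream": 1.0, "coffee": 0.3, "lemonade": 0.9},
}


def recommend_in_context(preferences, suitability, context, top_n=1):
    """Re-weight personalized scores by how well each item fits the context."""
    # An unknown context leaves the personalized scores unchanged.
    weights = suitability.get(context, {})
    scores = {
        item: preference * weights.get(item, 1.0)
        for item, preference in preferences.items()
    }
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]
```

With these numbers, the same user is offered coffee in the "cold" context and ice cream in the "hot" context, even though her underlying preferences never change.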

Before building a context-aware model, we should consider the following questions:

  • How should we define the contexts to be used in the recommender system?
  • What techniques should be used to build recommendations to achieve the business solution?
  • How do we extract the context preferences of the users with respect to the products?
  • What techniques should we use to combine the context preferences with user-profile preferences to generate recommendations?

The preceding image shows how different people, at different times and places, and with different company, need different dress recommendations.

Evolution of recommender systems with technology

With the advancements in technology, research, and infrastructure, recommender systems have been evolving rapidly. They are moving away from simple similarity-measure-based approaches to machine-learning approaches, and on to very advanced approaches such as deep learning. From a business angle, both customers and organizations are looking for more personalized recommendations that are delivered immediately. To build personalized recommenders that cater to a large base of users and products, we need sophisticated systems that can scale easily and respond fast. The following technologies can help solve this challenge.

Mahout for scalable recommender systems

As stated earlier, big data primarily drives recommender systems. Big-data platforms have enabled researchers to access large datasets and analyze data at the individual level, paving the way for building personalized recommender systems. With the increase in Internet usage and a constant supply of data, efficient recommenders not only require huge amounts of data, but also need infrastructure that can scale and has minimum downtime. To realize this, big-data technology such as the Apache Hadoop ecosystem provides the infrastructure and platform to supply large data. To build recommendation systems on this huge supply of data, Mahout, a machine-learning library built on the Hadoop platform, enables us to build scalable recommender systems. Mahout provides the infrastructure to build, evaluate, and tune different types of recommendation-engine algorithms. Since Hadoop is designed for offline batch processing, we can use it to build offline recommender systems that are scalable. In Chapter 9, Building Scalable Recommendation Engines with Mahout, we will see how to build scalable recommendation engines using Mahout.

The following figure displays how a scalable recommender system can be designed using Mahout:

Apache Spark for scalable real-time recommender systems

On many e-commerce sites, we have often seen the You may also like feature. This is a deceptively simple phrase that encapsulates a new era in customer relationship management delivered in real time. Business organizations have started investing in systems that can generate recommendations personalized to the customers and deliver them in real time. Building such a system not only gives good returns on investment; an efficient system also wins the confidence of the users. A scalable real-time recommender system not only captures users' purchase history, product information, and user preferences, extracts patterns, and recommends products, but also responds instantly based on users' online interactions and multi-criteria search preferences.

The ability to make such compelling suggestions requires a new generation of technology. This technology has to consider large databases of users' previous purchasing history, their preferences, and online interaction information such as in-page navigation data and multi-criteria searches, and then analyze all this information in real time and respond accurately according to the current and long-term needs of the users. In this book, we have considered in-memory and graph-based systems, which are capable of handling large-scale, real-time recommender systems.

Collaborative filtering, the most popular recommendation technique, requires considering the entirety of the user and product information while generating recommendations. Assume a scenario where we have 1 million user ratings on 10,000 products. To build a system that can handle such heavy computations and respond online, we require one that is big-data compatible and processes data in-memory. The key technology in enabling scalable, real-time recommendations is Apache Spark Streaming, which leverages the scalability of big data, processes data in-memory, and generates recommendations in real time.
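To see why collaborative filtering needs all the users and products at once, consider a toy item-based version in plain Python. This is not Spark code and not the book's implementation: the ratings, the function names, and the choice of cosine similarity over co-rated items are assumptions made for illustration only.

```python
import math
from collections import defaultdict

# Hypothetical ratings: user -> {product: rating on a 1..5 scale}.
RATINGS = {
    "u1": {"p1": 5, "p2": 4},
    "u2": {"p1": 4, "p2": 5, "p3": 2},
    "u3": {"p2": 1, "p3": 5},
}


def item_vectors(ratings):
    """Invert user -> item ratings into item -> {user: rating} vectors."""
    vectors = defaultdict(dict)
    for user, items in ratings.items():
        for item, rating in items.items():
            vectors[item][user] = rating
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse rating vectors."""
    dot = sum(u[k] * v[k] for k in set(u) & set(v))
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def predict(ratings, user, target):
    """Predict a rating as the similarity-weighted mean of the user's ratings."""
    vectors = item_vectors(ratings)
    numerator = denominator = 0.0
    for item, rating in ratings[user].items():
        similarity = cosine(vectors[target], vectors[item])
        numerator += similarity * rating
        denominator += abs(similarity)
    return numerator / denominator if denominator else None


def item_pair_count(n_items):
    """Number of distinct item pairs a full similarity matrix must cover."""
    return n_items * (n_items - 1) // 2
```

Every prediction touches the rating vectors of other users, and a full item-to-item similarity matrix for 10,000 products covers `item_pair_count(10_000)`, that is, 49,995,000 distinct pairs. Computation on this scale is what motivates a distributed, in-memory engine.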

Neo4j for real-time graph-based recommender systems

Graph databases have revolutionized the way people discover new products, information, and so on. In the human mind, we remember people, things, places, and so on, as graphs, relations, and networks. When we try to fetch information from these networks, we directly go to a required connection or graph and fetch information accurately. In a similar fashion, graph databases allow us to store user and product information in graphs as nodes and edges (relations). Searching in a graph database is fast. In recent times, recommender systems powered by graph databases have allowed organizations to build suggestions which are personalized and accurate in real time.
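The nodes-and-edges idea can be illustrated without a database at all. The sketch below uses a plain Python dictionary as a stand-in for a graph; it is not Neo4j or Cypher code, and the users, products, and function name are hypothetical.

```python
from collections import Counter

# Hypothetical graph: each user node has "bought" edges to product nodes.
BOUGHT = {
    "alice": {"laptop", "mouse"},
    "bob": {"laptop", "mouse", "keyboard"},
    "carol": {"laptop", "keyboard", "monitor"},
    "dave": {"phone"},
}


def recommend_from_graph(bought, user, top_n=2):
    """Walk user -> product -> other user -> product and count the endpoints."""
    own = bought[user]
    counts = Counter()
    for other, products in bought.items():
        # Only follow users connected to us through a shared product.
        if other == user or not (own & products):
            continue
        for product in products - own:
            counts[product] += 1
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [product for product, _ in ranked][:top_n]
```

For "alice", the walk reaches "keyboard" through two neighbors and "monitor" through one, so the keyboard is suggested first. A graph database stores these relationships directly, so this kind of traversal follows existing edges rather than joining whole tables.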

One of the key technologies enabling real-time recommendations using graph databases is Neo4j, a kind of NoSQL graph database that can easily outperform any other relational and NoSQL system in providing customer insights and product trends.

A NoSQL database, popularly known as not only SQL