Building Recommendation Engines E-Book

Suresh Kumar Gorakala

Publisher: Packt Publishing
Category: Science and new technologies
Language: English

Description

Understand your data and user preferences to make intelligent, accurate, and profitable decisions

About This Book

A step-by-step guide to building recommendation engines that are personalized, scalable, and real time
Get to grips with the best tool available on the market to create recommender systems
This hands-on guide shows you how to implement different tools for recommendation engines, and when to use which

Who This Book Is For

This book caters to beginners and experienced data scientists looking to understand and build complex predictive decision-making systems and recommendation engines using R, Python, Spark, Neo4j, and Hadoop.

What You Will Learn

Build your first recommendation engine
Discover the tools needed to build recommendation engines
Dive into the various techniques of recommender systems such as collaborative, content-based, and cross-recommendations
Create efficient decision-making systems that will ease your work
Familiarize yourself with machine learning algorithms in different frameworks
Master different versions of recommendation engines from practical code examples
Explore various recommender systems and implement them with popular tools such as R, Python, Spark, and others

In Detail

A recommendation engine (sometimes referred to as a recommender system) is a tool that lets algorithm developers predict what a user may or may not like among a list of given items. Recommender systems have become extremely common in recent years, and are applied in a variety of applications. The most popular ones are movies, music, news, books, research articles, search queries, social tags, and products in general.

The book starts with an introduction to recommendation systems and their applications. You will then start building recommendation engines straight away from the very basics. As you move along, you will learn to build recommender systems with popular tools such as R, Python, Spark, Neo4j, and Hadoop. You will get an insight into the pros and cons of each recommendation engine and when to use which, so that each pick is the one that suits you best.

During the course of the book, you will create a simple recommendation engine, a real-time recommendation engine, a scalable recommendation engine, and more. You will familiarize yourself with various techniques of recommender systems such as collaborative, content-based, and cross-recommendations before getting to know the best practices of building a recommender system towards the end of the book.

Style and approach

This book follows a step-by-step practical approach in which readers learn to build recommendation engines with increasing complexity in every chapter.

Details

Pages: 297

Publication year: 2016

Sample

Building Recommendation Engines

Credits

About the Author

About the Reviewers

www.PacktPub.com

Why subscribe?

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Downloading the color images of this book

Errata

Piracy

Questions

1. Introduction to Recommendation Engines

Recommendation engine definition

Need for recommender systems

Big data driving the recommender systems

Types of recommender systems

Collaborative filtering recommender systems

Content-based recommender systems

Hybrid recommender systems

Context-aware recommender systems

Evolution of recommender systems with technology

Mahout for scalable recommender systems

Apache Spark for scalable real-time recommender systems

Neo4j for real-time graph-based recommender systems

Summary

2. Build Your First Recommendation Engine

Building our basic recommendation engine

Loading and formatting data

Calculating similarity between users

Predicting the unknown ratings for users

Summary

3. Recommendation Engines Explained

Evolution of recommendation engines

Nearest neighborhood-based recommendation engines

User-based collaborative filtering

Item-based collaborative filtering

Advantages

Disadvantages

Content-based recommender systems

User profile generation

Advantages

Disadvantages

Context-aware recommender systems

Context definition

Pre-filtering approaches

Post-filtering approaches

Advantages

Disadvantages

Hybrid recommender systems

Weighted method

Mixed method

Cascade method

Feature combination method

Advantages

Model-based recommender systems

Probabilistic approaches

Machine learning approaches

Mathematical approaches

Advantages

Summary

4. Data Mining Techniques Used in Recommendation Engines

Neighbourhood-based techniques

Euclidean distance

Cosine similarity

Jaccard similarity

Pearson correlation coefficient

Mathematic model techniques

Matrix factorization

Alternating least squares

Singular value decomposition

Machine learning techniques

Linear regression

Classification models

Linear classification

KNN classification

Support vector machines

Decision trees

Ensemble methods

Random forests

Bagging

Boosting

Clustering techniques

K-means clustering

Dimensionality reduction

Principal component analysis

Vector space models

Term frequency

Term frequency inverse document frequency

Evaluation techniques

Cross-validation

Regularization

Root-mean-square error (RMSE)

Mean absolute error (MAE)

Precision and recall

Summary

5. Building Collaborative Filtering Recommendation Engines

Installing the recommenderlab package in RStudio

Datasets available in the recommenderlab package

Exploring the Jester5K dataset

Description

Usage

Format

Details

Exploring the dataset

Exploring the rating values

Building user-based collaborative filtering with recommenderlab

Preparing training and test data

Creating a user-based collaborative model

Predictions on the test set

Analyzing the dataset

Evaluating the recommendation model using the k-cross validation

Evaluating user-based collaborative filtering

Building an item-based recommender model

Building an IBCF recommender model

Model evaluation

Model accuracy using metrics

Model accuracy using plots

Parameter tuning for IBCF

Collaborative filtering using Python

Installing the required packages

Data source

Data exploration

Rating matrix representation

Creating training and test sets

The steps for building a UBCF

User-based similarity calculation

Predicting the unknown ratings for an active user

User-based collaborative filtering with the k-nearest neighbors

Finding the top-N nearest neighbors

Item-based recommendations

Evaluating the model

The training model for k-nearest neighbors

Evaluating the model

Summary

6. Building Personalized Recommendation Engines

Personalized recommender systems

Content-based recommender systems

Building a content-based recommendation system

Content-based recommendation using R

Dataset description

Content-based recommendation using Python

Dataset description

User activity

Item profile generation

User profile creation

Context-aware recommender systems

Building a context-aware recommender systems

Context-aware recommendations using R

Defining the context

Creating context profile

Generating context-aware recommendations

Summary

7. Building Real-Time Recommendation Engines with Spark

About Spark 2.0

Spark architecture

Spark components

Spark Core

Structured data with Spark SQL

Streaming analytics with Spark Streaming

Machine learning with MLlib

Graph computation with GraphX

Benefits of Spark

Setting up Spark

About SparkSession

Resilient Distributed Datasets (RDD)

About ML Pipelines

Collaborative filtering using Alternating Least Square

Model based recommender system using pyspark

MLlib recommendation engine module

The recommendation engine approach

Implementation

Data loading

Data exploration

Building the basic recommendation engine

Making predictions

User-based collaborative filtering

Model evaluation

Model selection and hyperparameter tuning

Cross-Validation

CrossValidator

Train-Validation Split

Setting the ParamMaps/parameters

Setting the evaluator object

Summary

8. Building Real-Time Recommendations with Neo4j

Discerning different graph databases

Labeled property graph

Understanding GraphDB core concepts

Neo4j

Cypher query language

Cypher query basics

Node syntax

Relationship syntax

Building your first graph

Creating nodes

Creating relationships

Setting properties to relations

Loading data from csv

Neo4j Windows installation

Installing Neo4j on the Linux platform

Downloading Neo4j

Setting up Neo4j

Starting Neo4j from the command line

Building recommendation engines

Loading data into Neo4j

Generating recommendations using Neo4j

Collaborative filtering using the Euclidean distance

Collaborative filtering using Cosine similarity

Summary

9. Building Scalable Recommendation Engines with Mahout

Mahout - a general introduction

Setting up Mahout

The standalone mode - using Mahout as a library

Setting Mahout for the distributed mode

Core building blocks of Mahout

Components of a user-based collaborative recommendation engine

Building recommendation engines using Mahout

Dataset description

User-based collaborative filtering

Item-based collaborative filtering

Evaluating collaborative filtering

Evaluating user-based recommenders

Evaluating item-based recommenders

SVD recommenders

Distributed recommendations using Mahout

ALS recommendation on Hadoop

The architecture for a scalable system

Summary

10. What Next - The Future of Recommendation Engines

Future of recommendation engines

Phases of recommendation engines

Phase 1 - general recommendation engines

Phase 2 - personalized recommender systems

Phase 3 - futuristic recommender systems

End of search

Leaving the Web behind

Emerging from the Web

Next best actions

Use cases to look out for

Smart homes

Healthcare recommender systems

News as recommendations

Popular methodologies

Serendipity

Temporal aspects of recommendation engines

A/B testing

Feedback mechanism

Summary

Building Recommendation Engines

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: December 2016

Production reference: 1231216

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-78588-485-6

www.packtpub.com

Credits

Author

Suresh Kumar Gorakala

Copy Editor

Manisha Sinha

Reviewers

Vikram Dhillon

Vimal Romeo

Project Coordinator

Nidhi Joshi

Commissioning Editor

Veena Pagare

Proofreader

Safis Editing

Acquisition Editor

Tushar Gupta

Indexer

Mariammal Chettiyar

Content Development Editor

Manthan Raja

Graphics

Disha Haria

Technical Editor

Dinesh Chaudhary

Production Coordinator

Arvindkumar Gupta

About the Author

Suresh Kumar Gorakala is a data scientist focused on artificial intelligence. He has close to 10 years of professional experience, having worked with various global clients across multiple domains and helped them solve their business problems using advanced big data analytics. He has worked extensively on recommendation engines, natural language processing, advanced machine learning, and graph databases. He previously co-authored Building a Recommendation System with R for Packt Publishing. He is a passionate traveler and a photographer by hobby.

I would like to thank my wife for putting up with my late-night writing sessions and all my family members for supporting me over the months. I also give deep thanks and gratitude to Barathi Ganesh, Raj Deepthi, Harsh, and my colleagues, without whose support this book quite possibly would not have happened. I would also like to thank all the mentors that I've had over the years. Without learning from these teachers, there is not a chance I could be doing what I do today, and it is because of them and others that I may not have listed here that I feel compelled to pass my knowledge on to those willing to learn. I would also like to thank all the reviewers and project managers of the book for making it a reality.

About the Reviewers

Vikram Dhillon is a software developer, a bioinformatics researcher, and a software coach at the Blackstone LaunchPad at the University of Central Florida. He has been working on his own startup involving healthcare data security of late. He lives in Orlando and regularly attends development meetups and hackathons. He enjoys spending his spare time reading about new technologies, such as the blockchain, and developing tutorials for machine learning in game design. He has been involved in open-source projects for over five years and writes about technology and startups at opsbug.com.

Vimal Romeo is a data scientist at Ernst and Young, Rome. He holds a master's degree in Big Data Analytics from Luiss Business School, Rome. He also holds an MBA degree from XIME, India, and a bachelor's degree in computer science and engineering from CUSAT, India. He is an author at MilanoR, a blog related to the R language.

I would like to thank my mom, Mrs Bernadit, and my brother, Vibin, for their continuous support. I would also like to thank my friends Matteo Amadei, Antonella Di Luca, Asish Mathew, and Eleonora Polidoro, who supported me during this process. A special thanks to Nidhi Joshi from Packt Publishing for keeping me motivated during the process.

www.PacktPub.com

For support files and downloads related to your book, please visit www.PacktPub.com.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www.packtpub.com/mapt

Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.

Why subscribe?

Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser

Love You Mom

Preface

Building Recommendation Engines is a comprehensive guide to implementing recommendation engines, such as collaborative filtering, content-based, and context-aware recommendation engines, using R, Python, Spark, Mahout, and Neo4j. The book covers various recommendation engines widely used across industries, along with their implementations. It also includes a chapter on popular data mining techniques commonly used in building recommendations and, at the end, briefly discusses the future of recommendation engines.

What this book covers

Chapter 1, Introduction to Recommendation Engines, is a refresher for data scientists and an introduction for beginners to recommendation engines. This chapter introduces popular recommendation engines that people use in their day-to-day lives. It also covers the popular recommendation engine approaches, along with their pros and cons.

Chapter 2, Build Your First Recommendation Engine, is a short chapter on how to build a movie recommendation engine, giving us a head start before we take off into the world of recommendation engines.

Chapter 3, Recommendation Engines Explained, is about the different recommendation engine techniques popularly employed, such as user-based collaborative filtering, item-based collaborative filtering, content-based recommendation engines, context-aware recommenders, hybrid recommenders, and model-based recommender systems using machine learning models and mathematical models.

Chapter 4, Data Mining Techniques Used in Recommendation Engines, is about various Machine Learning techniques used in building recommendation engines such as similarity measures, classification, regression, and dimension reduction techniques. This chapter also covers evaluation metrics to test the recommendation engine’s predictive power.

Chapter 5, Building Collaborative Filtering Recommendation Engines, is about how to build user-based collaborative filtering and item-based collaborative filtering in R and Python. We'll also learn about different libraries available in R and Python that are extensively used in building recommendation engines.

Chapter 6, Building Personalized Recommendation Engines, is about how to build personalized recommendation engines using R and Python and the various libraries used for building content-based recommender systems and context-aware recommendation engines.

Chapter 7, Building Real-Time Recommendation Engines with Spark, is about the basics of Spark and MLlib required for building real-time recommender systems.

Chapter 8, Building Real-Time Recommendation Engines with Neo4j, is about the basics of graphDB and Neo4j concepts and how to build real-time recommender systems using Neo4j.

Chapter 9, Building Scalable Recommendation Engines with Mahout, is about the basic building blocks of Hadoop and Mahout required for building scalable recommender systems. It also covers the architecture we use to build scalable systems and a step-by-step implementation using Mahout and SVD.

Chapter 10, What Next?, is the final chapter. It summarizes what we have learned so far, covers the best practices employed in building decision-making systems, and discusses where the future of recommender systems is set to move.

What you need for this book

To get started with different implementations of recommendation engines in R, Python, Spark, Neo4j, and Mahout, we need the following software:

Chapter number | Software required (with version) | Download links to the software | OS required
2, 4, 5 | RStudio version 0.99.489 | https://www.rstudio.com/products/rstudio/download/ | Windows 7+/CentOS 6
2, 4, 5 | R version 3.2.2 | https://cran.r-project.org/bin/windows/base/ | Windows 7+/CentOS 6
5, 6, 7 | Anaconda 4.2 for Python 3.5 | https://www.continuum.io/downloads | Windows 7+/CentOS 6
 | Neo4j 3.0.6 | https://neo4j.com/download/ | Windows 7+/CentOS 6
 | Spark 2.0 | https://spark.apache.org/downloads.html | Windows 7+/CentOS 6
 | Hadoop 2.5, Mahout 0.12 | http://hadoop.apache.org/releases.html, http://mahout.apache.org/general/downloads.html | Windows 7+/CentOS 6
7, 8, 9 | Java 7/Java 8 | http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html | Windows 7+/CentOS 6

Who this book is for

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book: what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail [email protected], and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

You can download the code files by following these steps:

Log in or register to our website using your e-mail address and password.
Hover the mouse pointer on the SUPPORT tab at the top.
Click on Code Downloads & Errata.
Enter the name of the book in the Search box.
Select the book for which you're looking to download the code files.
Choose from the drop-down menu where you purchased this book from.
Click on Code Download.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR / 7-Zip for Windows
Zipeg / iZip / UnRarX for Mac
7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/building-recommendation-engines. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Downloading the color images of this book

We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from http://www.packtpub.com/sites/default/files/downloads/BuildingRecommendationEngines_ColorImages.pdf.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books (maybe a mistake in the text or the code), we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at [email protected] with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at [email protected], and we will do our best to address the problem.

Need for recommender systems

Given the complexity and challenges in building recommendation engines, a considerable amount of thought, skill, investment, and technology goes into building recommender systems. Are they worth such an investment? Let us look at some facts:

Two-thirds of movies watched by Netflix customers are recommended movies
38% of click-through rates on Google News are recommended links
35% of sales at Amazon arise from recommended products
ChoiceStream claims that 28% of people would like to buy more music, if they find what they like

Big data driving the recommender systems

Of late, recommender systems have been impacting our lives in many ways. One obvious example is how our online shopping experience has been redefined. As we browse e-commerce sites and purchase products, the underlying recommendation engines respond immediately, in real time, with relevant suggestions. From the perspective of both businesses and consumers, recommendation engines have been immensely beneficial. Without a doubt, big data is the driving force behind recommender systems. A good recommendation engine should be reliable, scalable, and highly available, and should be able to provide personalized recommendations, in real time, to the large user base it serves.

A typical recommendation system cannot do its job efficiently without sufficient data. The introduction of big data technology enabled companies to capture plenty of user data, such as past purchases, browsing history, and feedback information, and feed it to the recommendation engines to generate relevant and effective recommendations in real time. In short, even the most advanced recommender system cannot be effective without the supply of big data. The role of big data and improvements in technology, both on the software and hardware front, goes beyond just supplying massive data. It also provides meaningful, actionable data fast, and provides the necessary setup to quickly process the data in real time.

Source: http://www.kdnuggets.com/2015/10/big-data-recommendation-systems-change-lives.html.

Types of recommender systems

Now that we have defined recommender systems, their objective and usefulness, and the driving force behind them, this section introduces the different types of popular recommender systems in use.

Collaborative filtering recommender systems

Collaborative filtering recommender systems are a basic form of recommendation engine. In this type of recommendation engine, items are filtered from a large set of alternatives collaboratively, using the preferences of many users.

The basic assumption in a collaborative filtering recommender system is that if two users shared the same interests as each other in the past, they will also have similar tastes in the future. If, for example, user A and user B have similar movie preferences, and user A recently watched Titanic, which user B has not yet seen, then the idea is to recommend this unseen new movie to user B. The movie recommendations on Netflix are one good example of this type of recommender system.

There are two types of collaborative filtering recommender systems:

User-based collaborative filtering: In user-based collaborative filtering, recommendations are generated by considering the preferences in the user's neighborhood. User-based collaborative filtering is done in two steps:

Identify similar users based on similar user preferences
Recommend new items to an active user based on the rating given by similar users on the items not rated by the active user

Item-based collaborative filtering: In item-based collaborative filtering, the recommendations are generated using the neighborhood of items. Unlike user-based collaborative filtering, we first find similarities between items and then recommend non-rated items that are similar to the items the active user has rated in the past. Item-based recommender systems are constructed in two steps:

Calculate the item similarity based on the item preferences
Find the non-rated items most similar to the items the active user has rated, and recommend them
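
The two user-based steps described above can be sketched in a few lines of Python. This is only a minimal illustration, not the implementation used later in the book: the toy ratings, the choice of cosine similarity, and the function names (cosine_similarity, predict_user_based, recommend_user_based) are all assumptions made for this example.

```python
from math import sqrt

# Hypothetical toy ratings (user -> {item: rating}); not data from the book.
ratings = {
    "A": {"Titanic": 5, "Avatar": 4, "Up": 1},
    "B": {"Avatar": 5, "Up": 1, "Alien": 2},
    "C": {"Titanic": 1, "Avatar": 2, "Up": 5, "Alien": 5},
}

def cosine_similarity(u, v):
    """Step 1: cosine similarity over the items both users have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(u[i] ** 2 for i in common))
    norm_v = sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

def predict_user_based(ratings, active, item):
    """Step 2: similarity-weighted average of other users' ratings for item."""
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == active or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[active], other_ratings)
        num += sim * other_ratings[item]
        den += sim
    return num / den if den else None  # None: nobody comparable rated the item

def recommend_user_based(ratings, active, top_n=3):
    """Rank the items the active user has not rated by predicted rating."""
    seen = set(ratings[active])
    candidates = {i for r in ratings.values() for i in r} - seen
    scored = [(i, predict_user_based(ratings, active, i)) for i in candidates]
    scored = [(i, s) for i, s in scored if s is not None]
    return sorted(scored, key=lambda p: p[1], reverse=True)[:top_n]

print(recommend_user_based(ratings, "B"))  # Titanic is B's only unseen item
```

In this toy data, user B's tastes are much closer to user A's than to user C's, so A's high rating of Titanic dominates the prediction for B.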

We will learn in depth about these two forms of recommendations in Chapter 3, Recommendation Engines Explained.

While building collaborative filtering recommender systems, we will learn about the following aspects:

How do we calculate the similarity between users?
How do we calculate the similarity between items?
How are recommendations generated?
How do we deal with new items and new users whose data is not known?
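
As one possible answer to the item-similarity question, the sketch below inverts the rating data so that each item is described by the users who rated it, and then compares items with cosine similarity. The data, the similarity measure, and the helper names (item_vectors, cosine, recommend_item_based) are assumptions chosen for illustration only.

```python
from math import sqrt

# Hypothetical toy ratings (user -> {item: rating}); not data from the book.
ratings = {
    "A": {"Titanic": 5, "Avatar": 4, "Up": 1},
    "B": {"Avatar": 5, "Up": 1, "Alien": 2},
    "C": {"Titanic": 1, "Avatar": 2, "Up": 5, "Alien": 5},
}

def item_vectors(ratings):
    """Invert user -> {item: rating} into item -> {user: rating}."""
    items = {}
    for user, user_ratings in ratings.items():
        for item, rating in user_ratings.items():
            items.setdefault(item, {})[user] = rating
    return items

def cosine(a, b):
    """Cosine similarity over the users who rated both items."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    norm_a = sqrt(sum(a[u] ** 2 for u in common))
    norm_b = sqrt(sum(b[u] ** 2 for u in common))
    return dot / (norm_a * norm_b)

def recommend_item_based(ratings, active, top_n=3):
    """Score each non-rated item by its similarity to the user's rated items."""
    items = item_vectors(ratings)
    rated = ratings[active]
    scores = {}
    for candidate in items:
        if candidate in rated:
            continue
        num = den = 0.0
        for item, rating in rated.items():
            sim = cosine(items[candidate], items[item])
            num += sim * rating
            den += sim
        if den:
            scores[candidate] = num / den
    return sorted(scores.items(), key=lambda p: p[1], reverse=True)[:top_n]

print(recommend_item_based(ratings, "A"))  # Alien is A's only non-rated item
```

Note that two items sharing a single rater always get a cosine similarity of 1.0 here, which is one reason practical systems usually require a minimum number of common raters.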

The advantage of collaborative filtering systems is that they are simple to implement and very accurate. However, they have their own set of limitations, such as the cold start problem: collaborative filtering systems fail to make recommendations to first-time users whose information is not yet available in the system.

Content-based recommender systems

In collaborative filtering, we consider only user-item preferences when building the recommender system. Though this approach is accurate, it makes more sense to also consider user properties and item properties while building recommendation engines. Unlike collaborative filtering, content-based recommendation engines use item properties and the user's preferences for those properties.

As the name indicates, a content-based recommender system uses the content information of the items for building the recommendation model. A content-based recommender system typically contains a user-profile-generation step, an item-profile-generation step, and a model-building step to generate recommendations for an active user. The system recommends items to users by matching the content or features of items against user profiles. As an example, if you have searched for videos of Lionel Messi on YouTube, then the content-based recommender system will learn your preference and recommend other videos related to Lionel Messi and other videos related to football.

In simpler terms, the system recommends items similar to those that the user has liked in the past. The similarity of items is calculated based on the features associated with the other compared items and is matched with the user's historical preferences.
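
A minimal sketch of this idea follows, assuming hand-made item profiles and a user profile built by averaging the features of liked items. The video titles, feature weights, and function names (build_user_profile, cosine, recommend_content_based) are hypothetical and serve only to illustrate the profile-matching step.

```python
from math import sqrt

# Hypothetical item profiles (item -> feature weights); not data from the book.
item_profiles = {
    "Messi top goals": {"football": 1.0, "messi": 1.0},
    "Messi free kicks": {"football": 1.0, "messi": 1.0, "tutorial": 0.5},
    "World Cup highlights": {"football": 1.0},
    "Pasta recipe": {"cooking": 1.0},
}

def build_user_profile(liked_items, item_profiles):
    """User profile: the average feature vector of the items the user liked."""
    profile = {}
    for item in liked_items:
        for feature, weight in item_profiles[item].items():
            profile[feature] = profile.get(feature, 0.0) + weight
    return {f: w / len(liked_items) for f, w in profile.items()}

def cosine(a, b):
    """Cosine similarity between two sparse feature vectors."""
    dot = sum(a[f] * b[f] for f in set(a) & set(b))
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend_content_based(liked_items, item_profiles, top_n=3):
    """Rank unseen items by similarity between item and user profiles."""
    if not liked_items:
        return []  # no history yet, so no profile to match against
    profile = build_user_profile(liked_items, item_profiles)
    scores = {
        item: cosine(profile, features)
        for item, features in item_profiles.items()
        if item not in liked_items
    }
    return sorted(scores.items(), key=lambda p: p[1], reverse=True)[:top_n]

print(recommend_content_based(["Messi top goals"], item_profiles))
```

With a single liked Messi video, the other Messi video ranks first, the general football video second, and the unrelated cooking video last with a score of zero.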

While building a content-based recommendation system, we take into consideration the following questions:

How do we choose content or features of the products?
How do we create user profiles with preferences similar to that of the product content?
How do we create similarity between items based on their features?
How do we create and update user profiles continuously?

The preceding considerations will be explained in Chapter 3, Recommendation Engines Explained. This technique doesn't take into consideration the user's neighborhood preferences. Hence, it doesn't require a large user group's preferences for items to achieve better recommendation accuracy. It only considers the user's past preferences and the properties/features of the items. Chapter 3 also covers this system in detail, along with its pros and cons.

Hybrid recommender systems

This type of recommendation engine is built by combining various recommender systems. By combining them, we can offset the disadvantages of one system with the advantages of another and thus build a more robust system. For example, collaborative filtering methods fail when new items don't have ratings, whereas content-based systems can use the feature information available about the items; by combining the two, new items can be recommended more accurately and efficiently.

For example, if you are a frequent reader of news on Google News, the underlying recommendation engine recommends news articles to you by combining popular news articles read by people similar to you and using your personal preferences, calculated using your previous click information. With this type of recommendation system, collaborative filtering recommendations are combined with content-based recommendations before pushing recommendations.
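
One simple way to combine the two kinds of recommendations is a weighted blend of their scores. The sketch below assumes both recommenders already produce scores on a common 0 to 1 scale; the article names, the scores, the 0.7 weight, and the function names (hybrid_score, recommend_hybrid) are hypothetical. It falls back to the content-based score for new items that have no collaborative filtering score.

```python
# Hypothetical scores on a 0..1 scale; not data from the book.
cf_scores = {"article_1": 0.9, "article_2": 0.4}  # no ratings yet for article_new
content_scores = {"article_1": 0.2, "article_2": 0.6, "article_new": 0.8}

def hybrid_score(cf_score, content_score, cf_weight=0.7):
    """Weighted blend; use the content score alone when CF has no data."""
    if cf_score is None:
        return content_score
    return cf_weight * cf_score + (1 - cf_weight) * content_score

def recommend_hybrid(cf_scores, content_scores, cf_weight=0.7, top_n=3):
    """Blend both score sets and rank items by the combined score."""
    blended = {
        item: hybrid_score(cf_scores.get(item), score, cf_weight)
        for item, score in content_scores.items()
    }
    return sorted(blended.items(), key=lambda p: p[1], reverse=True)[:top_n]

print(recommend_hybrid(cf_scores, content_scores))
```

The weight is a design choice: a higher cf_weight trusts the behavior of similar users more, while a lower one leans on the user's own content preferences.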

Before building a hybrid model, we should consider the following questions:

What recommender techniques should be combined to achieve the business solution?
How should we combine various techniques and their results for better predictions?

The advantage of hybrid recommendation engines is that they can improve the efficiency of recommendations compared to the individual recommendation techniques. This approach also gives users a good mix of recommendations, both at the personalized level and at the neighborhood level. In Chapter 3, Recommendation Engines Explained, we will learn more about hybrid recommendations.

Context-aware recommender systems

Personalized recommender systems, such as content-based recommender systems, have a limitation: they fail to take context into account when making suggestions. For example, assume a lady is very fond of ice cream, and that she travels to a cold place. There is a high chance that a personalized recommender system will suggest a popular ice-cream brand. Now let us ask ourselves a question: is it the right thing to suggest ice cream to a person in a cold place? It would make more sense to suggest a coffee. A system that makes this type of recommendation, which is both personalized and context-aware, is called a context-aware recommender system. In the preceding example, place is the context.

User preferences may differ with the context, such as time of day, season, mood, location, options offered by the system, and so on. A person at a different location, at a different time, with different people, may need different things. A context-aware recommender system takes the context into account before computing or serving recommendations, so that it can serve the different needs people have in different contexts.
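The ice-cream example can be sketched as a re-ranking step applied after the personalized scores have been computed. Everything named here is an assumption for illustration: the function `rerank_by_context`, the context labels, and the weights are hypothetical, and adjusting scores after the fact is only one of several ways to bring context into a recommender.

```python
def rerank_by_context(base_scores, context_weights, context):
    """Adjust personalized scores with context-specific multipliers.

    Items without a weight for the given context keep their base score.
    """
    weights = context_weights.get(context, {})
    adjusted = {
        item: score * weights.get(item, 1.0)
        for item, score in base_scores.items()
    }
    return sorted(adjusted.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical personalized scores: this user is very fond of ice cream.
base_scores = {"ice_cream": 0.9, "coffee": 0.6}

# Hypothetical multipliers per context (place).
context_weights = {
    "cold_place": {"ice_cream": 0.3, "coffee": 1.2},
    "hot_place": {"ice_cream": 1.2, "coffee": 0.7},
}

cold_ranking = rerank_by_context(base_scores, context_weights, "cold_place")
hot_ranking = rerank_by_context(base_scores, context_weights, "hot_place")
```

With these made-up weights, coffee ranks first in the cold place and ice cream ranks first in the hot place, even though the underlying user preference is unchanged.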

Before building a context-aware model, we should consider the following questions:

How should we define the contexts to be used in the recommender system?
What techniques should be used to build recommendations to achieve the business solution?
How do we extract the context preferences of the users with respect to the products?
What techniques should we use to combine the context preferences with user-profile preferences to generate recommendations?

The preceding image shows how different people, at different times and places, and with different company, need different dress recommendations.

Evolution of recommender systems with technology

With the advancements in technology, research, and infrastructure, recommender systems have been evolving rapidly. They are moving away from simple similarity-measure-based approaches, to machine-learning approaches, and on to very advanced approaches such as deep learning. From a business angle, both customers and organizations are looking for more personalized recommendations that are delivered immediately. To build personalized recommenders that cater to a large base of users and products, we need sophisticated systems that scale easily and respond fast. The following technologies can help solve this challenge.

Mahout for scalable recommender systems

As stated earlier, big data primarily drives recommender systems. Big-data platforms have enabled researchers to access large datasets and analyze data at the individual level, paving the way for personalized recommender systems. With the increase in Internet usage and a constant supply of data, efficient recommenders not only require huge amounts of data, but also need infrastructure that can scale and has minimal downtime. Big-data technology such as the Apache Hadoop ecosystem provides the infrastructure and platform to supply data at this scale. To build recommendation systems on this huge supply of data, we can use Mahout, a machine-learning library built on the Hadoop platform. Mahout provides the infrastructure to build, evaluate, and tune different types of recommendation-engine algorithms. Since Hadoop is designed for offline batch processing, the recommender systems we build this way are offline but scalable. In Chapter 9, Building Scalable Recommendation Engines with Mahout, we will see how to build scalable recommendation engines using Mahout.

The following figure displays how a scalable recommender system can be designed using Mahout:

Apache Spark for scalable real-time recommender systems

On e-commerce sites, we have all seen the "You may also like" feature many times. This deceptively simple phrase encapsulates a new era in customer relationship management delivered in real time. Business organizations have started investing in systems that can generate recommendations personalized to each customer and deliver them in real time. Building such a system not only gives a good return on investment; an efficient system also earns the confidence of its users. A scalable real-time recommender system not only captures users' purchase history, product information, and preferences, extracts patterns, and recommends products, but also responds instantly based on users' online interactions and multi-criteria search preferences.

The ability to make such compelling suggestions requires a new generation of technology. This technology has to consider large databases of users' previous purchasing history, their preferences, and online interaction information such as in-page navigation data and multi-criteria searches. It then has to analyze all this information in real time and respond accurately according to the current and long-term needs of the users. In this book, we have considered in-memory and graph-based systems, which are capable of handling large-scale, real-time recommender systems.

Collaborative filtering, the most popular recommendation technique, requires considering the entirety of the user and product information while generating recommendations. Assume a scenario where we have 1 million user ratings on 10,000 products. In order to handle such heavy computations and respond online, we require a system that is big-data compatible and processes data in-memory. The key technology in enabling scalable, real-time recommendations is Apache Spark Streaming, which leverages the scalability of big data, processes data in-memory, and generates recommendations in real time.
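To see why collaborative filtering has to look at every user, consider the following minimal, single-machine sketch in plain Python. It is not Spark code and is not taken from the book; the ratings data and the functions `user_similarity` and `predict_unseen` are hypothetical. For one target user, the prediction loop touches the ratings of all other users, which is the cost that grows with the size of the user base and motivates a distributed, in-memory engine.

```python
import math

# Hypothetical ratings on a 1 to 5 scale: user -> {product: rating}.
RATINGS = {
    "alice": {"p1": 5, "p2": 3, "p3": 4},
    "bob": {"p1": 4, "p2": 3, "p3": 5, "p4": 5},
    "carol": {"p1": 1, "p2": 5, "p5": 4},
}


def user_similarity(a, b):
    """Cosine similarity computed over the products both users rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)


def predict_unseen(target, ratings):
    """Predict ratings for products the target user has not rated.

    Each prediction is a similarity-weighted average of other users'
    ratings, so every other user has to be scanned.
    """
    weighted_sum = {}
    similarity_sum = {}
    for other, other_ratings in ratings.items():
        if other == target:
            continue
        sim = user_similarity(ratings[target], other_ratings)
        if sim <= 0:
            continue
        for product, rating in other_ratings.items():
            if product in ratings[target]:
                continue
            weighted_sum[product] = weighted_sum.get(product, 0.0) + sim * rating
            similarity_sum[product] = similarity_sum.get(product, 0.0) + sim
    predictions = {p: weighted_sum[p] / similarity_sum[p] for p in weighted_sum}
    return sorted(predictions.items(), key=lambda pair: pair[1], reverse=True)


alice_predictions = predict_unseen("alice", RATINGS)
```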

Neo4j for real-time graph-based recommender systems

Graph databases have revolutionized the way people discover new products, information, and so on. In the human mind, we remember people, things, places, and so on, as graphs, relations, and networks. When we try to fetch information from these networks, we go directly to the required connection and fetch the information accurately. In a similar fashion, graph databases allow us to store user and product information in graphs, as nodes and edges (relations). Because related records are directly connected, searching along these relations is fast. In recent times, recommender systems powered by graph databases have allowed organizations to build suggestions that are personalized and accurate in real time.

One of the key technologies enabling real-time recommendations using graph databases is Neo4j, a NoSQL graph database that can outperform relational and other NoSQL systems in providing customer insights and product trends.

A NoSQL database, popularly known as not only SQL