Personalized Recommendation Engine for Restaurants and Dishes


This article proposes a design for a personalized restaurant recommendation application that can be used by the customers of a given credit/debit card issuer. The application can be installed on the customer’s mobile phone. The recommendation engine will use machine learning algorithms and predictive analytics techniques to derive information from credit card transaction data, mobile location data, social media data (Facebook, Twitter), public data (Yelp.com, Bundle.com, etc.), mobile browser data, and restaurant data*.

The recommendation engine will have two modules. The first module will use a clustering algorithm to derive actionable information from credit card data and restaurant data. It will also use data mining techniques to integrate useful information available in the web space with the restaurant data (explained later). This module will execute periodically on the bank’s server, and the bank’s decision science analysts will be able to improve it whenever required. The second module will be able to access social media data, browser data, the wish list, mobile location data, and the output of module 1. It will also use a machine learning algorithm to learn about the personality of the customer from the data it can access. Additionally, the user can control this module by enabling or disabling it.

The recommendation system can also learn over time from the customer’s activities and visits to the recommended restaurants, and so improve its recommendations.

Section 1 describes the data sources to be used in the system. Section 2 discusses the integration of the data sources. Section 3 discusses the analytical methods used to identify the factors influencing the customer’s decision. Section 4 discusses the machine learning algorithms used to model the recommendation engine. Section 5 describes how the learning algorithm allows the recommendation system to learn by itself over time. Finally, Section 6 discusses various scenarios in which a customer can use the recommendation system.

Section 1
Data from various sources must be integrated to develop a robust recommendation engine. This section describes the major data sources that can be used.
1.   Credit Card Transaction Data: Credit card transaction data will be analyzed to calculate the average ticket size, the kinds of restaurants visited (Indian, continental, Chinese, Italian, multi-cuisine, family restaurant vs. bar), the frequency of visits, the location of the restaurants, the number of days between two successive visits, and the time of visit (breakfast, lunch, dinner, etc.). This information will be summarized at customer level with the customer ID (or credit card number) as the primary key. A table listing the restaurants each customer has already visited will also be created. The number of months of data used to create these variables will be decided by exploratory analysis of the transaction data and customer demographics. For this design, assume that exploratory analysis shows three months of transaction data to be sufficient for a robust recommendation engine.

2.   Mobile Location Data: Most mobile phones today have GPS, which can be used to track the customer’s movements. The time spent in a restaurant indicates the customer’s interest in it, and this time can be tracked using mobile location data. The customer’s current location will be used to recommend restaurants along with distance information, or only nearby restaurants, rather than restaurants that are far away. The location data will also be used to infer whether the customer is travelling by car (or another vehicle), by public transport, or on foot. The speed at which the customer moves from one place to another serves as a proxy for the mode of transport.

3.   Restaurant Data: Information about restaurants is available on their own websites and on review websites such as zomato.com. Using this information, a database will be created containing the following:

Name of the restaurant, website link, type of restaurant, menu, dishes it is famous for, location, opening hours, happy hour timings, and reviews and ratings on websites such as yelp.com and zomato.com.

The database can be created, managed and periodically refreshed by the bank itself, or the bank can tie up with restaurants to keep their data updated. Restaurants should be easy to convince, since the recommendation engine is expected to increase their footfall.
As this data cannot be exhaustive, on a few occasions restaurant data in the web space will be accessed in real time.

4.   Public Data: Data publicly available on websites such as yelp.com, zomato.com, justeat.in and Bundle.com. Reviews and ratings of the dishes, service and ambiance of the restaurants will be used for sentiment analysis and will be summarized at restaurant level. The summary will be integrated with the restaurant data. This data will be refreshed periodically.

5.   Social Media Data: Customer information from social media can be used to predict the persona of the customer, which will be an input to the restaurant recommendation. As social media data is not publicly available, a pop-up will ask for access to it when the customer installs the application on his/her mobile. A customer who does not give access during installation will see a message each time he/she uses the application, with a friendly prompt such as “For improved recommendations, please share your Facebook/Twitter data”.

6.   Customer Wish List: The recommendation engine will also ask the customer to maintain a wish list, which will be stored on the mobile. The engine will pop up periodically to ask the customer to update it. The customer can create a wish list of restaurants, places to visit, dishes to try, adventure sports, electronic gadgets, cars, etc. This data will be used to study the personality of the customer.


7.   Mobile Browser Search Data: Customers use their mobiles for internet browsing, and searches related to restaurants or food joints will be accessed by the application to improve the recommendations.

Section 2
The data sources discussed in Section 1 will be summarized at different levels. For example, credit card transaction data will be summarized at customer level, and a list of visited restaurants will also be created. Restaurant data will be held at restaurant level with all of its attributes. Social media data and the customer wish list will be used to define the personality of the customer, and the summarized information will be stored on the mobile. A logical data model will be designed through which the data sources are integrated by customer as well as by restaurant. Data from sources such as social media and the customer wish list will be present on the mobile only, or will be accessed in real time.

Data from public websites such as yelp.com and Bundle.com will be integrated with the restaurant data table by restaurant. Credit card transaction data will be integrated with the restaurant data table by restaurant, and with social media data by customer.
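As a rough illustration of the customer-level summary and the join by restaurant described in this section, here is a minimal Python sketch. The field names (`customer_id`, `restaurant_id`, `cuisine`), the sample restaurants and the in-memory dictionaries are assumptions made for the example; a real implementation would run against the bank's own tables.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass
class Transaction:
    customer_id: str      # primary key of the customer-level summary
    restaurant_id: str    # hypothetical key shared with the restaurant table
    amount: float
    visit_date: date


def summarize_by_customer(transactions, restaurants):
    """Return {customer_id: summary} built from raw restaurant transactions."""
    grouped = defaultdict(list)
    for txn in transactions:
        grouped[txn.customer_id].append(txn)

    summaries = {}
    for customer_id, txns in grouped.items():
        cuisine_counts = defaultdict(int)
        for txn in txns:
            # Join with the restaurant table by restaurant_id;
            # restaurants missing from the table are skipped.
            info = restaurants.get(txn.restaurant_id)
            if info is not None:
                cuisine_counts[info["cuisine"]] += 1
        summaries[customer_id] = {
            "visits": len(txns),
            "avg_ticket": sum(t.amount for t in txns) / len(txns),
            "last_visit": max(t.visit_date for t in txns),
            "visited": sorted({t.restaurant_id for t in txns}),
            "cuisine_counts": dict(cuisine_counts),
        }
    return summaries


# Made-up sample data, for illustration only.
restaurants = {
    "r1": {"name": "Spice Route", "cuisine": "Indian"},
    "r2": {"name": "Golden Wok", "cuisine": "Chinese"},
}
transactions = [
    Transaction("c1", "r1", 40.0, date(2024, 1, 5)),
    Transaction("c1", "r2", 20.0, date(2024, 1, 20)),
    Transaction("c1", "r1", 60.0, date(2024, 2, 2)),
]
summary = summarize_by_customer(transactions, restaurants)
print(summary["c1"])
```

The `visited` list plays the role of the table of restaurants the customer has already visited, and `cuisine_counts` is one example of an attribute that only becomes available once transactions are joined to the restaurant table.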

Section 3
This section discusses how the data sources described in the previous sections feed the decision-making process of the recommendation system, and which data will be used to derive which specific information about the customer.
Credit card transaction data will be used to identify the customer’s transaction pattern at restaurants and food joints. Information such as transaction ticket size and frequency of visits will be used to classify customers by value as high, medium or low spenders at food points. An FRM metric will be created to define these categories, where
 FRM (Frequency, Recency, Monetary)
= (Frequency of restaurant visits × Amount spent) / (Number of days since last visit)
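The FRM formula above can be sketched in a few lines of Python. Two things here are assumptions rather than part of the design: the number of days is floored at 1 so that a visit made today does not cause a division by zero, and the cut-off values used to label high, medium and low spenders are placeholders that would have to come from analysis of the real data.

```python
from datetime import date


def frm_score(visit_count, amount_spent, last_visit, today):
    """FRM = (frequency of visits x amount spent) / days since last visit."""
    # Assumption: floor at 1 day so a same-day visit does not divide by zero.
    days_since_last_visit = max((today - last_visit).days, 1)
    return visit_count * amount_spent / days_since_last_visit


def spender_segment(score, low_cutoff=50.0, high_cutoff=200.0):
    """Map an FRM score to a spender category (cut-offs are hypothetical)."""
    if score >= high_cutoff:
        return "high"
    if score >= low_cutoff:
        return "medium"
    return "low"


# 6 visits, 300 spent in total, last visit 10 days ago: 6 * 300 / 10 = 180
score = frm_score(6, 300.0, date(2024, 3, 1), date(2024, 3, 11))
print(score, spender_segment(score))
```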
The types of restaurants visited by the customer provide information about the customer’s taste, price range, timings, weekend versus weekday visits, visits on special days, and other preferences. These attributes will be used to identify restaurants the customer would like to visit, after considering the current location, the customer wish list and the customer persona. The types of restaurants visited also indicate which dishes the customer likes. If the transaction value shows no pattern, the customer probably likes to try different dishes at the restaurants, assuming he/she is visiting alone.

Customers who visit websites such as yelp.com and zomato.com are more sensitive to reviews, ratings, pricing and ambiance. Those who write reviews also like to share their views with others, which in effect acts as promotion (good reviews) or discouragement (bad reviews).

The customer wish list tells about his/her aspirations and taste. A variety of restaurants on the wish list suggests that the customer likes to try out different restaurants. The presence of healthy food items suggests that he/she is health conscious. The wish list also tells about his/her eating habits.

Data from social media reflects the customer’s personality. Frequent tweets, status updates and photo uploads suggest that he/she is an open person. Liked Facebook pages tell about his/her interests. The number of friends indicates how sociable he/she is.

Mobile location data tells about the customer’s travel preferences. Visits to many different places suggest that he/she likes visiting new places. Frequent searches for restaurants in the mobile browser suggest that the customer is either not happy with the recommendation system or not using it.

Section 4

Using the frequency of credit card transactions at restaurants, customers will be divided into groups. The first group will be customers who visit restaurants frequently. There will be two more groups: less frequent visitors and customers who have never visited. Using the customers’ transaction patterns over the three-month period, balance patterns and demographics, a look-alike model will be created to identify customers who are similar to frequent visitors in all attributes except visits to restaurants. This will enable the bank to tap customers who are actually visiting restaurants but using the cards of other banks.
Total customer target base = Frequent visitors + Look-alike customers

Using this customer base, clusters of customers will be developed with the K-means clustering technique. The variables related to restaurant transactions will not be used for clustering, but the other attributes will be. The restaurant transaction variables will instead be used to create cluster profiles, such as taste, ticket size, restaurant type, frequency, timings, outings and special days, holiday places the customer likes to visit, price sensitivity, etc. Such cluster profiling makes it possible to infer restaurant preference factors for a customer who has not used this bank’s credit card at restaurants, based on the profile of the cluster to which he/she belongs. Let’s call these attributes the customer’s restaurants and dishes selection attributes.
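To make the clustering and profiling step concrete, here is a small pure-Python sketch of K-means followed by a cluster profile. It is only an illustration under stated assumptions: the two features per customer are made-up, already-scaled non-restaurant attributes, the ticket sizes are invented, and in practice a library implementation such as scikit-learn's `KMeans` would normally be used instead of hand-written code.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=50, seed=0):
    """Plain K-means: returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct customers
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids


def profile_clusters(labels, ticket_sizes):
    """Average restaurant ticket size per cluster (a profiling variable only)."""
    totals = {}
    for label, ticket in zip(labels, ticket_sizes):
        count, total = totals.get(label, (0, 0.0))
        totals[label] = (count + 1, total + ticket)
    return {label: total / count for label, (count, total) in totals.items()}


# Hypothetical scaled non-restaurant attributes (e.g. income band, card usage).
points = [(0.1, 0.2), (0.15, 0.1), (0.2, 0.25), (0.9, 0.8), (0.85, 0.95), (0.8, 0.9)]
# Restaurant ticket sizes are NOT clustering inputs; they only describe clusters.
ticket_sizes = [15.0, 20.0, 25.0, 80.0, 90.0, 100.0]

labels, centroids = k_means(points, k=2)
profile = profile_clusters(labels, ticket_sizes)
print(labels, profile)
```

A customer with no restaurant transactions on this bank's card can still be assigned to the nearest centroid, and the profile of that cluster then stands in for his/her unknown restaurant preferences.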

Restaurant data will be used to create clusters of restaurants using a clustering technique. Restaurants within a cluster will be homogeneous, while restaurants in different clusters will be heterogeneous. The location of the restaurant will not be used as a clustering variable; price range, type of cuisine, ambiance, theme, dishes famous for, ratings, happy hour timings, discounts and offers, etc. will be. By matching the customer’s restaurants and dishes selection attributes against the clusters of restaurants, an initial set of recommended restaurants will be generated.
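One simple way to picture this matching step is to score each restaurant cluster by how many of the customer's selection attributes it satisfies and to recommend the restaurants of the best-scoring clusters. The sketch below is an assumption about how the matching could work, not a prescribed method; the attribute names, the equal weighting of attributes and the sample restaurants are all hypothetical.

```python
def match_score(customer_prefs, cluster_profile):
    """Count how many selection attributes the cluster profile satisfies."""
    score = 0
    if cluster_profile["cuisine"] in customer_prefs["cuisines"]:
        score += 1
    if cluster_profile["price_range"] == customer_prefs["price_range"]:
        score += 1
    if cluster_profile["avg_rating"] >= customer_prefs["min_rating"]:
        score += 1
    return score


def initial_recommendations(customer_prefs, clusters, top_clusters=1):
    """Return the restaurants of the best-matching restaurant clusters."""
    ranked = sorted(
        clusters,
        key=lambda cluster: match_score(customer_prefs, cluster["profile"]),
        reverse=True,
    )
    picks = []
    for cluster in ranked[:top_clusters]:
        picks.extend(cluster["restaurants"])
    return picks


# Made-up selection attributes and restaurant clusters, for illustration only.
customer_prefs = {"cuisines": {"Indian", "Chinese"}, "price_range": "mid", "min_rating": 4.0}
clusters = [
    {
        "profile": {"cuisine": "Italian", "price_range": "high", "avg_rating": 4.5},
        "restaurants": ["Trattoria Uno"],
    },
    {
        "profile": {"cuisine": "Indian", "price_range": "mid", "avg_rating": 4.2},
        "restaurants": ["Spice Route", "Tandoor House"],
    },
]
recommended = initial_recommendations(customer_prefs, clusters)
print(recommended)
```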

The first set of recommended restaurants will then pass through a filter on distance from the customer’s current position, as tracked by the smartphone’s GPS. The system will also suggest the time needed to reach the restaurant in current traffic, based on Google Maps data, along with a route by car, by public transport or on foot. There is always the possibility that the customer is in another country and that only a few restaurants from that country are detailed in the restaurant data table. In that case, the customer’s restaurant selection attributes will be used to search for restaurants in the web space and on public review sites such as bundle.com and yelp.com. This will generate recommendations with the customer’s location as one of the criteria.
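The distance filter can be sketched with the haversine formula, which gives the great-circle distance between two latitude/longitude points. This is a straight-line approximation only; travel time in current traffic would still have to come from a mapping service. The coordinates, restaurant names and the 3 km radius below are assumptions for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby(restaurants, lat, lon, max_km=3.0):
    """Keep restaurants within max_km of the customer, nearest first."""
    with_distance = [
        (haversine_km(lat, lon, r["lat"], r["lon"]), r["name"]) for r in restaurants
    ]
    return sorted((dist, name) for dist, name in with_distance if dist <= max_km)


# Hypothetical restaurants and customer position.
restaurants = [
    {"name": "Cafe A", "lat": 12.9750, "lon": 77.6000},
    {"name": "Cafe B", "lat": 13.1000, "lon": 77.6000},
]
result = nearby(restaurants, lat=12.9716, lon=77.5946)
for dist, name in result:
    print(f"{name}: {dist:.2f} km")
```

Disabling the location filter, as described later in this section, would simply mean skipping the call to `nearby`.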

The recommended restaurants will also pass through an optional filter based on the customer’s wish list. Any recommended restaurant that is also on the customer’s wish list will be given higher priority than the others. An additional filter based on the customer’s persona will then be applied. The persona information will be stored on the mobile, and a machine learning algorithm will run in the back end of the engine to update it on a weekly basis.
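The wish list priority can be illustrated with a stable sort: restaurants that are on the wish list move to the front, and the original ranking is preserved within each group. The function name and the `use_wish_list` switch are hypothetical; the switch mirrors the idea that this filter is optional.

```python
def prioritize(recommended, wish_list, use_wish_list=True):
    """Move wish-list restaurants to the front, keeping the original order otherwise."""
    if not use_wish_list:
        return list(recommended)
    wanted = set(wish_list)
    # Python's sort is stable, so ties keep their incoming order.
    # The key is False (sorts first) for restaurants on the wish list.
    return sorted(recommended, key=lambda name: name not in wanted)


ranked = prioritize(["Spice Route", "Golden Wok", "Tandoor House"], ["Tandoor House"])
print(ranked)
```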

The user of this recommendation system, at present a credit card customer of the bank, will have some control over how recommendations are generated for different cases. Some of these cases are discussed in Section 6. The customer will be able to disable the location filter, the wish list filter or the personality filter of the recommendation engine. The recommendation system will be able to download the customer’s restaurant selection attributes to the mobile. The recommendation engine will be coded so that it can access the restaurant cluster data in real time.





The flow of the recommendation engine can be explained by the pictorial model below.

Bank Space: Data stored on the bank’s storage server, together with the server executing the clustering algorithms. Credit card data is refreshed periodically by the bank. Restaurant data is updated by the bank itself or through tie-ups with listed restaurants.

Web Space: Public data such as yelp.com, zomato.com and restaurant websites. Data mining and text mining algorithms execute on the bank’s server and derive information through sentiment analysis. The result is integrated with the restaurant data by restaurant.
The initial set of recommended restaurants is generated by overlaying the customer’s restaurants and dishes selection attributes on the attributes of the restaurants.



Section 5
I am proposing a recommendation engine that uses dynamic rather than static data. The three months of credit card transaction data will be refreshed monthly, so the customer’s restaurants and dishes selection attributes will be updated on a monthly basis. The restaurant data table will also be refreshed periodically, and as this table also uses information (ratings, reviews, etc.) from public websites, more homogeneous clusters of restaurants will be created over time.
Customer feedback is tracked indirectly through the smartphone: based on location data and total time spent, the mobile can track which of the recommended restaurants the customer has visited. More time spent in a particular type of restaurant also reveals his/her preference. Using a machine learning algorithm, the engine can fine-tune its picture of the customer’s choices over time. If the customer visits none of the recommended restaurants and instead uses the browser to search further, it means the engine could not recommend any restaurant of his/her choice. The attributes of the restaurants actually visited will then be considered in future recommendations, using the machine learning algorithm and the credit card data.
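The design does not fix a particular learning algorithm for this feedback loop. As one very simple possibility, offered only as an assumption, the engine could keep a weight per restaurant attribute and nudge the weights of a visited restaurant's attributes towards a signal based on time spent, in the style of an exponential moving average. The learning rate, the 60-minute reference visit and the attribute names are all hypothetical.

```python
def update_preferences(weights, visited_attributes, minutes_spent,
                       learning_rate=0.2, full_visit_minutes=60):
    """Return new attribute weights after one observed restaurant visit."""
    # Longer visits give a stronger signal, capped at 1.0.
    signal = min(minutes_spent / full_visit_minutes, 1.0)
    updated = dict(weights)  # do not mutate the caller's dictionary
    for attribute in visited_attributes:
        old = updated.get(attribute, 0.0)
        # Move the weight a fraction of the way towards the signal.
        updated[attribute] = old + learning_rate * (signal - old)
    return updated


weights = {"Indian": 0.5}
# The customer spent 90 minutes at an Indian rooftop restaurant.
new_weights = update_preferences(weights, ["Indian", "rooftop"], minutes_spent=90)
print(new_weights)
```

Attributes of restaurants the customer found through the browser rather than through the engine could be fed through the same update, so that they influence future recommendations.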

The customer wish list will pop up periodically for updating. The machine learning algorithm developed to define the customer’s persona will learn more about the customer over time from tweets, status updates, likes, photo sharing, time spent on social media, the wish list, etc.
  
The bank must develop a robust plan to deploy a system that can house the information derived by running the clustering algorithms on credit card transaction data and restaurant data. I have named this the Bank Space. For the best personalized recommendations, my design proposes using the processor of the mobile phone to execute module 2 of the engine. The processor will also be used to access the customer’s social media data and movements (location). This gives the engine a mechanism to learn over time. Another implementation challenge may be round-the-clock access, by the application installed on the mobile phone, to the summarized output of the clustering algorithms deployed on the server. This challenge can be overcome by automatically downloading the customer’s restaurant selection attributes and the initial set of recommended restaurants to the mobile.

Section 6
A customer can use the recommendation system in different ways. The cases below take the example of Mr John to show where the recommendation system helps the customer.
1.   On a Sunday evening, John goes to see an art exhibition. On his way back he wants to have dinner, but he is not familiar with the locality. He uses the recommendation application, and a set of nearby restaurants is recommended along with a map. These restaurants are selected based on John’s restaurants and dishes selection attributes and persona. He also sees in the results that this is a busy hour for these restaurants, so he decides to look for more recommendations on his way home and go where he can have dinner quickly.
2.   Next Monday John turns 31. He wants to throw a party for his colleagues and is unsure whether to go out for dinner or lunch. As the party is for his colleagues, he switches off the personality filter and his personal restaurant selection attributes. The recommendation system generates a set of restaurants with information on type of cuisine and best time to visit.
3.   John is on a business trip to France. He uses the recommendation system to find a restaurant he can enjoy. As the restaurant data has no restaurants in France, the initial internal recommendation is empty, but using John’s restaurants and dishes selection attributes and persona information, the recommendation engine searches for local restaurants. The results are 2 Chinese and 3 Indian restaurants with metropolitan ambience within 2 miles. John is Indian and has frequently visited Indian and Chinese restaurants in other countries too (not only in India), which is captured in his attributes.

The process flow of the design explains which data sources should be used to derive which information and how the personalized recommendations can be generated. This design can be implemented in any programming language.
  
