Market Basket Analysis using R

Market Basket Analysis using RStudio (Example: Groceries Dataset)

A rule of the form A => B reads: “When a user purchases the items on the left-hand side, the user probably also purchases the item on the right-hand side.” A more human-readable instance: a customer who buys bread probably also buys butter.

Three important measures describe a rule: support, confidence, and lift. The following points explain what each one tells us.

  1. Support: the proportion of transactions in the dataset in which the itemset occurs. Support tells us what percentage of transactions contain the combination of items A and B. It helps identify combinations that are frequent enough to be of interest (e.g., purchasing fish alone or purchasing fish and lemons together).

  2. Confidence: the probability that a new transaction containing the items on the left-hand side also contains the item on the right-hand side. Confidence tells us what percentage of transactions with item A also have item B (e.g., how many transactions that have bread also have butter).

  3. Lift: the ratio of the rule's confidence to its expected confidence, where the expected confidence is the support of B, i.e. the confidence the rule would have if A and B were independent. In marketing terms, lift is the ratio of the number of responses obtained with the model to the number obtained without the model.

  • Lift (A => B) = 1 means there is no correlation between the items in the itemset.

  • Lift (A => B) > 1 means a positive correlation: the products in the itemset, A and B, are purchased together more often than expected.

  • Lift (A => B) < 1 means a negative correlation: the products in the itemset, A and B, are unlikely to be purchased together.
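To make the three measures concrete, here is a minimal base R sketch that computes them by hand. The five transactions and the helper functions `support`, `confidence`, and `lift` are hypothetical illustrations, not part of the Groceries data or the arules package.

```r
# Toy transactions (hypothetical data, not taken from Groceries)
transactions <- list(
  c("bread", "butter", "milk"),
  c("bread", "butter"),
  c("bread", "milk"),
  c("milk", "eggs"),
  c("bread", "butter", "eggs")
)

# Support: fraction of transactions that contain every item in `items`
support <- function(items, trans) {
  mean(vapply(trans, function(t) all(items %in% t), logical(1)))
}

# Confidence(A => B) = support(A and B) / support(A)
confidence <- function(lhs, rhs, trans) {
  support(c(lhs, rhs), trans) / support(lhs, trans)
}

# Lift(A => B) = confidence(A => B) / support(B)
lift <- function(lhs, rhs, trans) {
  confidence(lhs, rhs, trans) / support(rhs, trans)
}

supp_ab <- support(c("bread", "butter"), transactions)
conf_ab <- confidence("bread", "butter", transactions)
lift_ab <- lift("bread", "butter", transactions)
print(c(support = supp_ab, confidence = conf_ab, lift = lift_ab))
```

For the rule bread => butter in this toy data, support is 0.6, confidence is 0.75, and lift is 1.25, so the two items appear together more often than independence would predict.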

Rule-based association algorithms are usually viewed as a two-step approach:

  • Frequent itemset generation: find all itemsets with support >= a predetermined minimum support count.

  • Rule generation: list all association rules that can be formed from the frequent itemsets, compute the support and confidence of each rule, and prune the rules that fail the minimum support and minimum confidence thresholds.
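The two steps above can be sketched in base R on a toy example. This is a brute-force enumeration meant only to show the logic. It is not the Apriori algorithm, which prunes candidates instead of testing every itemset. The data, the thresholds, and the names `min_support`, `min_confidence`, and `toy_rules` are all assumptions made for illustration, and only single-item right-hand sides are generated.

```r
# Toy transactions (hypothetical data, not taken from Groceries)
transactions <- list(
  c("bread", "butter", "milk"),
  c("bread", "butter"),
  c("bread", "milk"),
  c("milk", "eggs"),
  c("bread", "butter", "eggs")
)
min_support <- 0.3      # illustrative thresholds
min_confidence <- 0.7

# Fraction of transactions that contain every item in `items`
support <- function(items, trans) {
  mean(vapply(trans, function(t) all(items %in% t), logical(1)))
}

# Step 1: enumerate every candidate itemset and keep the frequent ones
items <- sort(unique(unlist(transactions)))
candidates <- unlist(
  lapply(seq_along(items), function(k) combn(items, k, simplify = FALSE)),
  recursive = FALSE
)
frequent <- Filter(function(s) support(s, transactions) >= min_support, candidates)

# Step 2: split each frequent itemset of size >= 2 into lhs => rhs
# and keep the rules that reach the minimum confidence
toy_rules <- list()
for (s in frequent) {
  if (length(s) < 2) next
  for (rhs in s) {
    lhs <- setdiff(s, rhs)
    conf <- support(s, transactions) / support(lhs, transactions)
    if (conf >= min_confidence) {
      toy_rules[[length(toy_rules) + 1]] <- data.frame(
        rule = paste0("{", paste(lhs, collapse = ", "), "} => {", rhs, "}"),
        support = support(s, transactions),
        confidence = conf,
        stringsAsFactors = FALSE
      )
    }
  }
}
toy_rules <- do.call(rbind, toy_rules)
print(toy_rules)
```

With these thresholds, six itemsets are frequent and two rules survive: {butter} => {bread} and {bread} => {butter}. On real data, `apriori()` from arules performs both steps far more efficiently.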

Required Packages for the analysis

    # arules and arulesViz are the required packages for Market Basket Analysis
    library(arules)
    library(arulesViz)
    library(datasets)

    # Built-in data: the Groceries dataset ships with the arules package
    data("Groceries")
    class(Groceries)
    inspect(head(Groceries))   # show the first few transactions
    itemFrequencyPlot(Groceries, topN = 20, type = "absolute")

    # Mine rules with minimum support 0.001 and minimum confidence 0.8
    rules <- apriori(Groceries, parameter = list(supp = 0.001, conf = 0.8))
    rules <- sort(rules, by = "confidence", decreasing = TRUE)

    # Same thresholds, but limit each rule to at most 3 items
    rules <- apriori(Groceries, parameter = list(supp = 0.001, conf = 0.8, maxlen = 3))

    # Prune redundant rules (rules that are subsets of another rule)
    subset.matrix <- is.subset(rules, rules, sparse = FALSE)
    subset.matrix[lower.tri(subset.matrix, diag = TRUE)] <- NA
    redundant <- colSums(subset.matrix, na.rm = TRUE) >= 1
    rules.pruned <- rules[!redundant]
    rules <- rules.pruned

    # Rules that lead to whole milk (right-hand side fixed)
    rules <- apriori(data = Groceries, parameter = list(supp = 0.001, conf = 0.08),
                     appearance = list(default = "lhs", rhs = "whole milk"),
                     control = list(verbose = FALSE))
    rules <- sort(rules, decreasing = TRUE, by = "confidence")
    inspect(rules[1:5])

    # Rules that start from whole milk (left-hand side fixed)
    rules <- apriori(data = Groceries, parameter = list(supp = 0.001, conf = 0.15, minlen = 2),
                     appearance = list(default = "rhs", lhs = "whole milk"),
                     control = list(verbose = FALSE))
    rules <- sort(rules, decreasing = TRUE, by = "confidence")
    inspect(rules[1:5])
    plot(rules, method = "graph")