More Examples and recipes¶
Example SQL statement to transform transactional data into RFM data¶
Let’s review what our variables mean:
frequencyrepresents the number of repeat purchases the customer has made. This means that it’s one less than the total number of purchases. This is actually slightly wrong. It’s the count of distinct time periods the customer had a purchase in. So if using days as units, then it’s the count of distinct days the customer had a purchase on.
Trepresents the age of the customer in whatever time units chosen. This is equal to the duration between a customer’s first purchase and the end of the period under study.
recencyrepresents the age of the customer when they made their most recent purchases. This is equal to the duration between a customer’s first purchase and their latest purchase. (Thus if they have made only 1 purchase, the recency is 0.)
Thus, executing a query against a transactional dataset, called
orders, in a SQL-store may look like:
SELECT customer_id, COUNT(distinct date(transaction_at)) - 1 as frequency, datediff('day', MIN(transaction_at), MAX(transaction_at)) as recency, datediff('day', CURRENT_DATE, MIN(transaction_at)) as T FROM orders GROUP BY customer_id
Create table with RFM summary matrix with holdout¶
recency have the same meaning as in previous section.
Two variables to set before executing:
duration_holdout- holdout duration in days.
CURRENT_DATE- current date, could be changed to final date of the transactional data.
select a.*, COALESCE(b.frequency_holdout, 0) as frequency_holdout, duration_holdout as duration_holdout from ( select customer_id, datediff(max(event_date), min(event_date)) as recency, count(*) - 1 as frequency, datediff(date_sub(CURRENT_DATE, duration_holdout), min(event_date)) as T from orders where event_date < date_sub(CURRENT_DATE, duration_holdout) group by customer_id ) a left join ( select customer_id, count(*) as frequency_holdout from orders where event_date >= date_sub(CURRENT_DATE, duration_holdout) and event_date < CURRENT_DATE group by customer_id ) b on a.customer_id = b.customer_id