RFM analysis for Customer Segmentation

Ogulcan Ertunc
4 min read · Jan 25, 2021

Good marketers understand the importance of “getting to know your customer”. Rather than focusing only on generating more clicks, marketers should follow the paradigm shift from increasing CTRs (click-through rates) towards retention, loyalty, and customer relationships, because it is easier to retain existing customers than to seek new ones.

It would be a mistake to analyze the entire customer base in the same way and engage everyone with the same campaigns. Rather than segmenting customers only by age or geography, it is better to group them into homogeneous segments according to their behavior and engage each segment with relevant campaigns.

One of the most popular, easy-to-use, and effective segmentation methods that enable marketers to analyze customer behavior is RFM analysis.

What is RFM Analysis?

RFM stands for Recency, Frequency, and Monetary value, each corresponding to a key customer characteristic. These metrics are important indicators of a customer’s behavior: frequency and monetary value affect a customer’s lifetime value for the firm, while recency affects retention, a measure of engagement.

RFM Analysis Example

Data Source

For this sample analysis, I used the “Online Retail II UCI” dataset available on Kaggle.

Importing Libraries and Data

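import datetime as dt
import warnings
from datetime import timedelta
from operator import attrgetter

import matplotlib.colors as mcolors
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Load the raw data. The file name below is an assumption; point it at
# wherever you saved the Kaggle download.
df_ = pd.read_csv("online_retail_II.csv")

# Work on a copy so the raw data stays untouched, and lowercase the column names.
df = df_.copy()
df.columns = df.columns.str.lower()
df.head()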

First, after importing the dataset, take a quick look at it with the head function to get a rough idea of its structure.

After that, let’s explore some basic information in this section, such as the best-selling product and the most expensive product.
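As a rough illustration of this kind of exploration, here is a minimal sketch on a few made-up rows. The column names (description, quantity, price) are assumptions based on the lowercased headers of the dataset, and the values are invented for the example.

```python
import pandas as pd

# Toy rows standing in for the real dataset. Column names are assumptions
# based on the lowercased Online Retail II headers; values are made up.
df = pd.DataFrame({
    "description": ["MUG", "MUG", "LAMP", "CLOCK"],
    "quantity": [10, 5, 2, 1],
    "price": [2.5, 2.5, 20.0, 35.0],
})

# Best-selling product: total units sold per product, largest first.
units_sold = df.groupby("description")["quantity"].sum().sort_values(ascending=False)
best_seller = units_sold.index[0]

# Most expensive product: the row with the highest unit price.
most_expensive = df.loc[df["price"].idxmax(), "description"]

print(best_seller, most_expensive)
```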

Data Preprocessing

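After these preprocessing steps, we need to calculate the Recency Score, Frequency Score, and Monetary Score.

For these operations, we first need to understand what the Recency, Frequency, and Monetary values mean. Recency is the time elapsed since a customer’s last purchase, Frequency is the number of purchases, and Monetary is the customer’s total spend.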
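To make the three values concrete, the sketch below computes them per customer on a handful of invented transactions. The column names, the total_price helper column, and the today_date reference point are all assumptions made for this example, not necessarily what the notebook uses.

```python
import pandas as pd

# Invented transactions; column names are assumed to match the lowercased dataset.
df = pd.DataFrame({
    "customer id": [1, 1, 2, 3, 3, 3],
    "invoice": ["A1", "A2", "B1", "C1", "C2", "C2"],
    "invoicedate": pd.to_datetime([
        "2011-12-01", "2011-12-08", "2011-06-15",
        "2011-11-20", "2011-12-05", "2011-12-05",
    ]),
    "quantity": [2, 1, 5, 3, 4, 1],
    "price": [10.0, 20.0, 4.0, 5.0, 2.5, 10.0],
})

# Spend per row (hypothetical helper column).
df["total_price"] = df["quantity"] * df["price"]

# Reference date for recency; an assumed "analysis day" shortly after the last invoice.
today_date = pd.Timestamp("2011-12-11")

rfm = df.groupby("customer id").agg(
    recency=("invoicedate", lambda d: (today_date - d.max()).days),  # days since last purchase
    frequency=("invoice", "nunique"),                                # number of distinct invoices
    monetary=("total_price", "sum"),                                 # total spend
)

print(rfm)
```

Counting distinct invoices rather than rows keeps a single basket with many items from inflating the frequency.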

The smaller the Recency value, the more valuable the customer is to us. So let’s use the qcut function to divide the Recency column into five groups of as equal size as possible, assigning a Recency Score of 5 to the group with the lowest values.

Frequency and Monetary values are also converted to scores with the qcut function, but this time the highest values receive a score of 5.

Next, we need to determine the actions we will take by segmenting customers according to their R-F-M scores. For this, let’s use the standard segment names that are already in common use. In the map below, each pattern matches a two-digit string: the Recency Score followed by the Frequency Score.
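A minimal sketch of the recency scoring, using ten invented recency values. The column name recency_score is an assumption for this example. Passing the labels in descending order is what makes the lowest recency values receive a 5.

```python
import pandas as pd

# Ten invented recency values (days since last purchase), indexed by customer.
rfm = pd.DataFrame(
    {"recency": [3, 10, 25, 40, 60, 90, 120, 200, 300, 365]},
    index=pd.Index(range(1, 11), name="customer id"),
)

# Five quantile bins of roughly equal size; labels run 5..1 so that the
# smallest recency values (most recent buyers) get the highest score.
rfm["recency_score"] = pd.qcut(rfm["recency"], 5, labels=[5, 4, 3, 2, 1])

print(rfm)
```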
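The sketch below scores Frequency and Monetary on invented values. One assumption beyond the text: frequency is ranked with rank(method="first") before binning. Frequency columns tend to contain many tied values (lots of customers with a single purchase), and qcut raises an error when tied values produce duplicate bin edges, so ranking first is a common workaround rather than a requirement.

```python
import pandas as pd

# Invented values; note the repeated 1s and 2s in frequency.
rfm = pd.DataFrame({
    "frequency": [1, 1, 1, 1, 2, 2, 3, 5, 8, 20],
    "monetary": [15.0, 30.0, 42.5, 60.0, 95.0, 120.0, 250.0, 400.0, 900.0, 3100.0],
})

# Ranking with method="first" breaks ties by order of appearance, so every
# row gets a unique rank and qcut can always find five distinct bin edges.
rfm["frequency_score"] = pd.qcut(
    rfm["frequency"].rank(method="first"), 5, labels=[1, 2, 3, 4, 5]
)

# Monetary values are usually distinct enough to bin directly.
rfm["monetary_score"] = pd.qcut(rfm["monetary"], 5, labels=[1, 2, 3, 4, 5])

print(rfm)
```

The trade-off of ranking first is that customers with identical frequency can end up with different scores, depending on their row order.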

# Keys are regex patterns over the two-digit string "<recency score><frequency score>".
seg_map = {
    r'[1-2][1-2]': 'Hibernating',
    r'[1-2][3-4]': 'At_Risk',
    r'[1-2]5': 'Cant_Loose',
    r'3[1-2]': 'About_to_Sleep',
    r'33': 'Need_Attention',
    r'[3-4][4-5]': 'Loyal_Customers',
    r'41': 'Promising',
    r'51': 'New_Customers',
    r'[4-5][2-3]': 'Potential_Loyalists',
    r'5[4-5]': 'Champions'
}
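One way to apply this map is sketched below: concatenate the Recency and Frequency scores into a two-character string and look up the first pattern that matches it. The helper to_segment and the column names rf_score and segment are hypothetical names chosen for this example, and the scores are invented.

```python
import re

import pandas as pd

seg_map = {
    r'[1-2][1-2]': 'Hibernating',
    r'[1-2][3-4]': 'At_Risk',
    r'[1-2]5': 'Cant_Loose',
    r'3[1-2]': 'About_to_Sleep',
    r'33': 'Need_Attention',
    r'[3-4][4-5]': 'Loyal_Customers',
    r'41': 'Promising',
    r'51': 'New_Customers',
    r'[4-5][2-3]': 'Potential_Loyalists',
    r'5[4-5]': 'Champions'
}


def to_segment(rf_score: str) -> str:
    """Return the first segment whose pattern matches the whole two-digit score."""
    for pattern, name in seg_map.items():
        if re.fullmatch(pattern, rf_score):
            return name
    return "Unknown"


# Invented scores for eight customers.
rfm = pd.DataFrame({
    "recency_score": [5, 1, 3, 5, 1, 4, 2, 5],
    "frequency_score": [5, 3, 3, 1, 1, 5, 5, 2],
})

# Build the two-digit key, e.g. recency 5 and frequency 1 -> "51".
rfm["rf_score"] = rfm["recency_score"].astype(str) + rfm["frequency_score"].astype(str)
rfm["segment"] = rfm["rf_score"].map(to_segment)

print(rfm)
```

A shorter alternative is to pass the dictionary to pandas directly with `rfm["rf_score"].replace(seg_map, regex=True)`; the explicit helper is used here only to make the matching step visible.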

With this segmentation, we can look at the share of each segment, decide what kind of action to take for each group, and act quickly.
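The share of each segment can be computed along these lines. The table is a toy example with invented values, and the names segment_share and segment_profile are assumptions for this sketch.

```python
import pandas as pd

# Toy RFM table with segments already assigned; all values are illustrative.
rfm = pd.DataFrame({
    "segment": ["Champions", "Champions", "At_Risk", "Hibernating",
                "Hibernating", "Hibernating", "Hibernating", "Loyal_Customers"],
    "recency": [2, 5, 150, 300, 280, 310, 330, 30],
    "frequency": [20, 15, 6, 1, 1, 2, 1, 9],
    "monetary": [3000.0, 2500.0, 700.0, 40.0, 55.0, 80.0, 25.0, 1200.0],
})

# Share of customers in each segment, as a percentage.
segment_share = rfm["segment"].value_counts(normalize=True).mul(100).round(1)

# Average raw R, F, and M values per segment, useful for sanity-checking the labels.
segment_profile = rfm.groupby("segment")[["recency", "frequency", "monetary"]].mean()

print(segment_share)
print(segment_profile)
```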

Example Simple Action

At-Risk

The at_risk group is of great importance to us. It is made up of customers who shopped frequently in the past but have not made a purchase recently.
We need to bring back these customers’ old habits. It is understandable that 13% of our customers are in this group, but a lower rate would be preferable.
Special campaigns, effective communication, and attractive new approaches can be used to reduce this number.
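To act on this segment, the customers in it first have to be pulled out of the RFM table. Here is a minimal sketch on invented data. Sorting by monetary value so the highest spenders are contacted first is an assumption of this example, not something prescribed by RFM itself.

```python
import pandas as pd

# Invented RFM rows indexed by customer id.
rfm = pd.DataFrame(
    {
        "segment": ["At_Risk", "Champions", "At_Risk", "Hibernating", "At_Risk"],
        "recency": [120, 3, 200, 340, 95],
        "monetary": [450.0, 2800.0, 1300.0, 60.0, 720.0],
    },
    index=pd.Index([101, 102, 103, 104, 105], name="customer id"),
)

# Keep only at-risk customers, highest spenders first.
at_risk = rfm[rfm["segment"] == "At_Risk"].sort_values("monetary", ascending=False)

# IDs to hand over to a campaign, and the segment's share of all customers.
at_risk_ids = at_risk.index.tolist()
at_risk_share = round(100 * len(at_risk) / len(rfm), 1)

print(at_risk_ids, at_risk_share)
```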

I would like to express my sincere thanks to Vahit Keskin and my mentor Atilla Yardimci, who taught me and helped me complete this project.

The notebook associated with this article is on GitHub if you want to follow along.

Ogulcan Ertunc

I’m an IT consultant and a Data Analytics graduate. I’m a data enthusiast 💻 😃 passionate about learning and working with new tech. https://github.com/ogulcanertunc