Maximizing Profits with Machine Learning: A Guide to Retail Price Optimization

Phillip Peng
15 min read · Mar 25, 2024

The art and science of leveraging machine learning to support your retail price optimization strategy.

Machine Learning Model for Retail Price Optimization (created by Dalle-3)

1. Introduction

In the fiercely competitive world of retail, setting the right price for your products can make all the difference between thriving and barely surviving. However, finding that sweet spot is easier said than done. With countless factors to consider, from consumer demand and market trends to competitor pricing and inventory levels, determining the optimal price point can feel like navigating a minefield.

This is where machine learning (ML) comes to the rescue. By leveraging the power of advanced algorithms and vast amounts of data, retailers can now predict consumer behavior and market dynamics with unprecedented accuracy. ML models, such as the ExpectileGAM, are revolutionizing the way businesses approach pricing strategies, enabling them to maximize profits and stay ahead of the competition.

In this blog post, we’ll dive deep into the concept of retail price optimization and explore how the ExpectileGAM model can be applied to a real-world retail dataset. We’ll walk you through the process step-by-step, from understanding the underlying principles to interpreting the results and making data-driven business decisions.

Whether you’re a seasoned retailer looking to optimize your pricing strategy or a data science enthusiast eager to learn about real-world ML applications, this post has something for you. So grab a cup of coffee, sit back, and let’s embark on a journey to uncover the secrets of retail price optimization with machine learning.

2. Price Optimization Framework: A Three-Pronged Approach to Maximizing Profits

When it comes to setting prices in the retail industry, relying on gut feelings or traditional methods is no longer enough. To stay competitive and maximize profits, retailers must adopt a data-driven approach that leverages the power of business analytics. This is where the price optimization framework comes into play.

The price optimization framework consists of three common stages of business analytics: descriptive analysis, predictive analysis, and prescriptive analysis. Each stage plays a crucial role in helping retailers make informed decisions about pricing strategies. Let’s take a closer look at each one.

2.1. Descriptive Analysis: Understanding the Past to Inform the Future

Descriptive analysis involves diving deep into historical sales and pricing data to uncover trends, patterns, and relationships between various factors, such as price, sales volume, and market conditions.

Retailers use techniques like data aggregation, data transformation, visualization, and statistical analysis to gain a comprehensive understanding of the market and customer behavior. By examining past data, they can identify key insights that will inform the development of pricing strategies moving forward.
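
As a minimal illustration of the aggregation step, a sketch like the following (assuming a sales table with price, quantity_sold, and product columns, as in the dataset used later in this post) surfaces per-product price and volume patterns:

import pandas as pd

# Hedged sketch: aggregate historical sales by product to surface
# price and volume patterns. Column names match the dataset used
# later in this post.
sales = pd.read_csv('data/price_optimization.csv')
sales['revenue'] = sales['price'] * sales['quantity_sold']

summary = (
    sales.groupby('product')
         .agg(avg_price=('price', 'mean'),
              min_price=('price', 'min'),
              max_price=('price', 'max'),
              total_units=('quantity_sold', 'sum'),
              total_revenue=('revenue', 'sum'))
         .sort_values('total_revenue', ascending=False)
)
print(summary)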

2.2. Predictive Analysis: Forecasting the Impact of Pricing Strategies

Once retailers have a solid understanding of historical data, they move on to the predictive analysis stage. This is where the real magic of machine learning comes into play.

In this stage, retailers build statistical or machine learning models, such as the ExpectileGAM model, to predict how changes in price will impact sales volume, revenue, and profit. By training these models on historical data, retailers can estimate the price demand elasticity, or the relationship between price changes and sales volume.

Armed with this information, retailers can make informed decisions about pricing strategies, knowing how each change is likely to affect their bottom line.
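
As a quick illustration of what estimating elasticity can look like in practice (a simple sketch, not the ExpectileGAM approach used later in this post), the slope of a log-log regression of quantity on price approximates the price elasticity of demand:

import numpy as np
import pandas as pd

# Hedged sketch: elasticity as the slope of log(quantity) on
# log(price). A slope of -2.0 means a 1% price increase is
# associated with roughly a 2% drop in units sold.
def estimate_elasticity(df: pd.DataFrame) -> float:
    slope, _intercept = np.polyfit(np.log(df['price']),
                                   np.log(df['quantity_sold']), deg=1)
    return slope

# Example usage on the dataset introduced in Section 3:
# elasticities = data.groupby('product').apply(estimate_elasticity)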

2.3. Prescriptive Analysis: Optimizing Prices for Maximum Profit

The final stage of the price optimization framework is prescriptive analysis. This is where retailers put all the insights and predictions from the previous stages into action.

Using mathematical programming techniques like linear programming, integer programming, or dynamic programming, retailers can determine the optimal pricing strategies that maximize revenue, profit, or other business objectives. These techniques take into account constraints such as customer satisfaction, market share, and competition, ensuring that the chosen pricing strategy is not only profitable but also sustainable in the long run.

By combining predictive models with optimization algorithms, retailers can make data-driven decisions that balance short-term revenue with long-term strategic goals.
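
To make this stage concrete, here is a minimal, hedged sketch that frames price selection as a small mathematical program: one price per product is chosen from a discrete grid to maximize total predicted revenue, subject to a toy cap on the combined price level. The grid, demand forecasts, and constraint are illustrative placeholders, not figures from the case study that follows.

import numpy as np
from scipy.optimize import linprog

# Hedged sketch: pick one price per product from a small grid to
# maximize total predicted revenue, subject to an illustrative cap
# on the sum of chosen prices. All numbers are made-up placeholders.
price_grid = np.array([20.0, 25.0, 30.0])      # candidate prices
pred_qty = np.array([[120, 100, 70],           # product A demand forecast
                     [90, 80, 65]])            # product B demand forecast
revenue = price_grid * pred_qty                # predicted revenue, shape (2, 3)

n_prod, n_grid = revenue.shape
c = -revenue.ravel()                           # linprog minimizes, so negate
# Each product must select exactly one grid price
A_eq = np.zeros((n_prod, n_prod * n_grid))
for i in range(n_prod):
    A_eq[i, i * n_grid:(i + 1) * n_grid] = 1.0
b_eq = np.ones(n_prod)
# Illustrative business constraint: combined chosen prices <= 50
A_ub = np.tile(price_grid, n_prod)[None, :]
b_ub = np.array([50.0])

# The integrality argument requires SciPy >= 1.9 (solved via HiGHS)
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, 1)] * (n_prod * n_grid),
              integrality=np.ones(n_prod * n_grid))
choice = res.x.reshape(n_prod, n_grid).argmax(axis=1)
print("chosen prices:", price_grid[choice])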

2.4. The Power of Integration: Bringing It All Together

While each stage of the price optimization framework is valuable on its own, the real power lies in integrating all three into a comprehensive pricing strategy. By leveraging descriptive, predictive, and prescriptive analytics, retailers can make informed, data-driven decisions that maximize revenue, profit, and customer satisfaction.

In the following sections, we’ll explore how the ExpectileGAM model can be applied to a real-world retail dataset, walking you through the process step-by-step. Get ready to see the price optimization framework in action!

3. From Data to Decisions: Applying the Retail Price Optimization Framework to an iPhone Accessory Retailer

3.1. Descriptive Analysis: Uncovering Insights from Historical Data

In the world of retail price optimization, data is king. To make informed decisions about pricing strategies, retailers must first understand the patterns and trends hidden within their historical data. This is where descriptive analysis comes into play.

To illustrate the process of building a machine learning (ML) model for retail price optimization, let’s consider a real-world use case. Imagine a retailer specializing in iPhone accessories, selling four types of products:

  1. Standard case for iPhone 15 Pro
  2. Standard case for iPhone 15 Pro Max
  3. Premium case for iPhone 15 Pro
  4. Premium case for iPhone 15 Pro Max

The objective is to analyze historical data and recommend a price optimization strategy to the company’s leadership. As the first stage of the analytics process, we begin with descriptive analysis.

Step 1: Data Loading and Transformation

The first step in descriptive analysis is to load the input data from a CSV file containing historical price, quantity sold, product name, and associated events. This data forms the foundation for our analysis.

import pandas as pd

# Load the historical sales data
data = pd.read_csv('data/price_optimization.csv')
# Flag whether a promotional event occurred and compute revenue
data['is_event'] = (data['event'] != "No Promo").astype(int).astype(str)
data['revenue'] = data['price'] * data['quantity_sold']
data.head()

In the same step, we perform data transformation to create derived features that provide additional insight: an ‘is_event’ flag indicating whether a promotional event occurred, and ‘revenue’, calculated by multiplying price by quantity sold.

A sample of the transformed data contains the columns price, quantity_sold, product, event, is_event, and revenue.

Step 2: Visualizing Price Distribution with ggplot

To gain a deeper understanding of the pricing patterns for each product, we can visualize the price distribution. In this use case, we’ll use the plotnine package in Python, which provides a ggplot-style grammar of graphics.

from plotnine import ggplot, aes, geom_density, facet_wrap, theme_minimal, theme, labs

# Distribution plot of price by product: two-by-two grid
(
    ggplot(data, aes(x='price', fill='product'))
    + geom_density(alpha=0.6)
    + facet_wrap('~product', ncol=2)
    + theme_minimal()
    + theme(figure_size=(10, 5))  # adjust the width and height here
    + labs(title='Price Distribution by Product', x='Price', y='Density', fill='Product')
)

By leveraging ggplot, we can easily create density curves to represent the price distribution for each product as in Figure 1. The density curve visualization provides valuable insights into the pricing strategies for each iPhone accessory product.

Figure 1: Price Distribution by Product

By examining the price distribution chart, we can make several key observations:

  1. Premium vs. Standard Cases: The density curves for premium cases are positioned further to the right compared to the standard cases, confirming that premium cases are priced higher for both iPhone 15 Pro and Pro Max models. This aligns with the expected price differentiation between premium and standard products.
  2. iPhone Models Comparison: The density curves for the iPhone 15 Pro and Pro Max cases have similar shapes and positions for both standard and premium cases. This suggests that the pricing strategies for accessories are consistent across the two iPhone models, without significant differences in pricing distributions.
  3. Price Variability: The density curves for all products are relatively narrow, indicating that prices are fairly consistent within each product category. The premium cases have slightly wider curves compared to the standard cases, suggesting a bit more price variability for premium products. However, overall, the pricing appears to be quite stable and predictable for each accessory type.
  4. Market Strategy Insights: The visualization reveals a clear separation between the pricing of standard and premium cases, with premium products occupying a higher price range. This suggests a market segmentation strategy, where the retailer offers distinct product tiers to cater to different customer preferences and willingness to pay. The relatively narrow curves also imply a focused pricing strategy within each category, aimed at maintaining consistent prices and avoiding frequent fluctuations.

Step 3: Unveiling Demand Elasticity Insights

Next, we use a scatter plot to visualize the data, coloring the points by product and adding a trend line for each product. The trend line serves as a proxy for the demand curve, with a steeper slope indicating greater price elasticity of demand.

import plotly.express as px

# Scatter plot of price vs. quantity sold with a LOWESS trend line
# per product (the 'lowess' trendline requires statsmodels)
px.scatter(
    data, x='price', y='quantity_sold', color='product',
    trendline='lowess', width=800, height=600,
    title='Price vs Quantity Sold'
)

Key observations from the chart:

  1. The standard case for the iPhone 15 Pro (green) has the steepest trendline, indicating the highest price elasticity of demand. This suggests that customers are most responsive to price changes for this product.
  2. The premium case for the iPhone 15 Pro (purple) has the flattest trendline, implying the lowest price elasticity of demand. Customers are less sensitive to price changes for this product, giving the retailer more flexibility in pricing.
  3. The standard and premium cases for the iPhone 15 Pro Max (red and blue, respectively) fall in between, with moderate price elasticity of demand.

These insights into price elasticity are crucial for developing optimal pricing strategies. For products with high elasticity, retailers must be cautious when adjusting prices, as small changes can significantly impact demand. Conversely, products with low elasticity allow for more pricing flexibility without substantially affecting sales.

By leveraging these insights alongside other market factors, retailers can make data-driven decisions to maximize profitability and customer satisfaction in their iPhone accessory product lines.

3.2. Predictive Analysis: Harnessing the Power of ExpectileGAM for Price Optimization

In the realm of retail price optimization, predictive analysis plays a pivotal role in unlocking actionable insights and driving data-driven decision-making. Many ML models are suitable for retail price optimization; in this blog post, we will introduce a powerful tool in our arsenal: ExpectileGAM, a type of Generalized Additive Model (GAM) that has proven to be a game-changer in our use case.

As a Generalized Additive Model (GAM), ExpectileGAM combines the interpretability of linear models with the flexibility of non-parametric smoothers. GAMs are an extension of generalized linear models (GLMs) that allow for more flexible and complex relationships between the response variable and the predictors.

ExpectileGAM, in particular, offers several unique advantages for retail price optimization:

  1. Quantile-Style Estimation: ExpectileGAM performs expectile regression, a close cousin of quantile regression based on an asymmetrically weighted squared error. This is particularly useful for retail price optimization, as it lets businesses model different levels of the demand distribution, capturing the spectrum of customer demand from pessimistic to optimistic scenarios.
  2. Non-linear Relationships: ExpectileGAM allows for the modeling of non-linear relationships between the response variable and predictors, providing a more accurate representation of real-world dynamics compared to traditional linear regression models.
  3. Interpretability: Despite its powerful predictive capabilities, ExpectileGAM remains highly interpretable, allowing data scientists and business stakeholders to gain valuable insights into the relationship between predictors and the response variable.
  4. Robustness: By fitting lower and upper expectiles explicitly, ExpectileGAM characterizes the noisy, heavy-tailed demand distributions that are common in retail data, rather than collapsing everything into a single mean estimate.
  5. Efficiency: ExpectileGAM can efficiently handle large datasets and high-dimensional feature spaces, making it an ideal choice for retail price optimization in big data scenarios.
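
For intuition on what an expectile is (a brief aside, not part of the modeling pipeline below): quantile regression minimizes an asymmetrically weighted absolute error, while expectile regression minimizes an asymmetrically weighted squared error. A minimal numpy illustration:

import numpy as np

# Hedged sketch: the asymmetric squared loss behind expectiles.
# For level q, positive residuals are weighted by q and negative
# residuals by (1 - q); q = 0.5 recovers the ordinary mean.
def expectile_loss(y, pred, q):
    resid = y - pred
    weights = np.where(resid >= 0, q, 1 - q)
    return np.mean(weights * resid ** 2)

y = np.array([10.0, 12.0, 15.0, 40.0])  # toy demand observations
grid = np.linspace(y.min(), y.max(), 1000)
# The 0.975 expectile sits well above the mean, tracing an upper band
e975 = grid[np.argmin([expectile_loss(y, g, 0.975) for g in grid])]
print(round(float(e975), 1))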

The general structure of a GAM can be represented by the equation:

g(E(y)) = β0 + f1(x1) + f2(x2) + … + fn(xn)

Here, g(E(y)) represents the link function applied to the expected value of the response variable, β0 is the intercept, and f1 through fn are smooth functions applied to predictor variables x1 through xn. These smooth functions enable GAMs to capture the intricate, non-linear relationships that often exist in real-world data.

The Python code for building the ExpectileGAM model is provided below:

from joblib import dump
import pandas as pd
from pygam import ExpectileGAM, s, f
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# 'data' is the pandas DataFrame loaded earlier, with 'product' and 'event' columns
df = data.copy()

# Create and fit a label encoder for each categorical feature
product_encoder = LabelEncoder()
event_encoder = LabelEncoder()
df['product'] = product_encoder.fit_transform(df['product'])
df['event'] = event_encoder.fit_transform(df['event'])

# Save the encoders to file for reuse at inference time
dump(product_encoder, 'product_encoder.joblib')
dump(event_encoder, 'event_encoder.joblib')

X = df[['price', 'product', 'event']].copy()
y = df['quantity_sold'].copy()

# Split X and y into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

gams = []
model_train_performance = []
model_test_performance = []
df_gam_results = pd.DataFrame()
quantiles = [0.025, 0.5, 0.975]

for q in quantiles:
    # Fit the model on the training data: a smooth term for price,
    # factor terms for product and event
    gam = ExpectileGAM(s(0) + f(1) + f(2), expectile=q).fit(X_train, y_train)
    # Save the gam model for this expectile level
    dump(gam, f'models/gam_model_quant_{q}.joblib')
    gams.append({"q": q, "gam_model": gam})
    # Predict on the train, test, and full datasets
    y_train_pred = gam.predict(X_train)
    y_test_pred = gam.predict(X_test)
    y_pred = gam.predict(X)
    # Calculate MSE and R-squared on the train and test data
    train_mse = mean_squared_error(y_train, y_train_pred)
    test_mse = mean_squared_error(y_test, y_test_pred)
    train_r2 = r2_score(y_train, y_train_pred)
    test_r2 = r2_score(y_test, y_test_pred)
    model_train_performance.append({"q": q, "mse": train_mse, "r2": train_r2})
    model_test_performance.append({"q": q, "mse": test_mse, "r2": test_r2})
    df_gam_results[f"pred_quant_{q}"] = y_pred

# Combine the original data with the predictions for plotting; rename
# 'revenue' to 'revenue_actual' so it is distinct from predicted revenue
df_gam_predictions = pd.concat(
    [data.reset_index(drop=True), df_gam_results.reset_index(drop=True)], axis=1
).rename(columns={'revenue': 'revenue_actual'})

The R² for the 0.5 expectile is 0.92, indicating the model performs well in explaining the variance of sales quantity.
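
One convenient way to inspect the fit across all three expectile levels is to tabulate the metric lists populated in the training loop, as in this small hedged snippet:

import pandas as pd

# Hedged sketch: combine train and test metrics collected in the
# training loop into one comparison table per expectile level
perf = pd.DataFrame(model_train_performance).merge(
    pd.DataFrame(model_test_performance),
    on='q', suffixes=('_train', '_test'))
print(perf)  # columns: q, mse_train, r2_train, mse_test, r2_test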

We can leverage the predictive model for further insights. For now, we will use the following Python code snippets to visualize the model results:

from plotnine import (ggplot, aes, geom_ribbon, geom_point, geom_line,
                      facet_wrap, labs, scale_color_manual, theme)

# 'tk.palette_timetk().values()' returns a list of color values;
# we create a list manually for this example
color_values = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd', '#8c564b']

# Plot the fitted demand curves with expectile bands ("No Promo" segment)
(
    ggplot(df_gam_predictions[df_gam_predictions['event'] == "No Promo"],
           aes(x='price', y='quantity_sold', color='product', group='product'))
    + geom_ribbon(aes(ymax='pred_quant_0.975', ymin='pred_quant_0.025'),
                  fill="#d3d3d3", color="none", alpha=0.5)
    + geom_point(alpha=0.5)
    + geom_line(aes(y='pred_quant_0.5'), color="blue")
    + facet_wrap('~product', scales="free")
    + labs(title="GAM Price vs Quantity Model")
    + scale_color_manual(values=color_values)
    + theme(figure_size=(10, 6))
)

The generated chart looks like this:

In this chart, we visualize the predicted median quantity sold for each product at different price points, specifically for the ‘no promotion’ segment. The blue lines represent the predicted values based on the trained ExpectileGAM model.

We can use the following code snippet to visualize how the event type impacts price elasticity:

import pytimetk as tk  # provides palette_timetk() and theme_timetk()
from plotnine import facet_grid

# Faceted demand curves by product and event type (promo periods only)
(
    ggplot(
        data=df_gam_predictions.loc[df_gam_predictions['is_event'] == '1'],
        mapping=aes(x='price', y='quantity_sold', color='product', group='product'))
    + geom_point(alpha=0.5)
    + geom_line(aes(y='pred_quant_0.5'), color="blue")
    + facet_grid('product ~ event', scales="free")
    + labs(title="[Special Events] GAM Price vs Quantity Model")
    + scale_color_manual(values=list(tk.palette_timetk().values()))
    + tk.theme_timetk(width=800, height=1100)
)

The generated chart looks like this:

The chart above shows that different event types shift the demand curve, and with it the price elasticity, in different ways.

3.3. Prescriptive Analysis: Optimizing Prices for Maximum Revenue

In the final stage of our retail price optimization journey, we dive into prescriptive analysis, where we leverage the insights gained from our predictive models to make data-driven decisions. At the heart of this process lies the price optimization algorithm, a powerful tool that helps us determine the optimal prices for each product to maximize revenue.

Quantile-Based Optimization: Preparing for Every Scenario

The price optimization algorithm takes into account different quantile levels of the predicted revenue, enabling us to consider various market scenarios. By optimizing prices at the 50th, 97.5th, and 2.5th quantiles, we can identify the prices that are expected to generate the highest revenue under different conditions.

The 50th quantile represents the median or expected revenue, providing a balanced view of the potential outcome. The 97.5th quantile, on the other hand, represents the best-case scenario, where we aim to capitalize on favorable market conditions. Conversely, the 2.5th quantile represents the worst-case scenario, allowing us to make informed decisions that mitigate potential risks.

# Compute predicted daily revenue at each expectile level
for col in df_gam_predictions.columns:
    if col.startswith('pred_quant_'):
        q = col.split('_')[2]
        df_gam_predictions['revenue_pred_' + q] = (
            df_gam_predictions['price'] * df_gam_predictions[col]
        )

# For each product, find the price with the highest predicted revenue
# at the median, upper, and lower expectile levels ("No Promo" only)
no_promo = df_gam_predictions.loc[df_gam_predictions['event'] == "No Promo"]

best_50 = no_promo.groupby('product') \
    .apply(lambda x: x[x['revenue_pred_0.5'] == x['revenue_pred_0.5'].max()].head(1)) \
    .reset_index(level=0, drop=True)

best_975 = no_promo.groupby('product') \
    .apply(lambda x: x[x['revenue_pred_0.975'] == x['revenue_pred_0.975'].max()].head(1)) \
    .reset_index(level=0, drop=True)

best_025 = no_promo.groupby('product') \
    .apply(lambda x: x[x['revenue_pred_0.025'] == x['revenue_pred_0.025'].max()].head(1)) \
    .reset_index(level=0, drop=True)
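
To turn these rows into a concise recommendation table, a small snippet along these lines (building on the data frames computed above) can help:

# Hedged sketch: summarize the recommended price per product at the
# median expectile level, alongside its predicted revenue
recommendation = best_50[['product', 'price', 'revenue_pred_0.5']] \
    .rename(columns={'price': 'optimal_price',
                     'revenue_pred_0.5': 'predicted_revenue'})
print(recommendation.to_string(index=False))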

Visualizing the Optimization Results: A Clear Path Forward

To gain a clear understanding of the optimization results, we once again turn to the power of visualization using the ggplot library from the plotnine package. The resulting chart showcases the optimized revenue for each product, represented by red dots, along with the upper and lower bounds, depicted by blue dots.

# Visualize the GAM revenue optimization results
(
    ggplot(
        data=df_gam_predictions.loc[df_gam_predictions['event'] == "No Promo"],
        mapping=aes(x='price', y='revenue_pred_0.5', color='product', group='product'))
    + geom_ribbon(aes(ymax="revenue_pred_0.975", ymin="revenue_pred_0.025"),
                  fill="#d3d3d3", color="#FF000000", alpha=0.5, show_legend=False)
    # Overlay the actual revenue points for reference
    + geom_point(aes(y='revenue_actual'), alpha=0.15, color="#2C3E50")
    + geom_line(aes(y='revenue_pred_0.5'), alpha=0.5)
    + geom_point(data=best_50, color="red")
    + geom_point(data=best_975, mapping=aes(y='revenue_pred_0.975'), color="blue")
    + geom_point(data=best_025, mapping=aes(y='revenue_pred_0.025'), color="blue")
    + facet_wrap('product', scales="free")
    + labs(
        title="iPhone Case Price Optimization",
        subtitle="Maximum Median Revenue (Red) vs 95% Maximum Confidence Interval (Blue)",
        x="Price",
        y="Predicted Revenue"
    )
    + scale_color_manual(values=list(tk.palette_timetk().values()))
    + tk.theme_timetk(width=800, height=600)
)

Business insights for retail price optimization: The visualization provides valuable insights into the optimal pricing strategies for each product. For the Premium Case and Standard Case of the iPhone 15 Pro Max, the optimal prices (red dots) lie at the upper end of the price range. This suggests that the company should consider a price increase for these models to maximize revenue potential.

On the other hand, for the Premium Case and Standard Case of the iPhone 15 Pro, the optimal prices (red dot) are situated in the middle of the price range. This indicates that the company should aim for prices close to the red dots to strike a balance between revenue optimization and market competitiveness.

4. Next Steps: Expanding the Horizons of Retail Price Optimization

In our current solution, we have developed a robust predictive model and performed calculations to obtain directional insights for price changes. While this is a significant step forward, it’s important to acknowledge that there is more work to be done to scale the retail price optimization strategy:

  1. Expand product coverage: Include a wider range of products in the optimization process to ensure a comprehensive pricing strategy across the entire product portfolio.
  2. Enrich feature set: Incorporate additional predictor features that capture relevant factors influencing pricing decisions, such as seasonality, customer demographics, and market trends.
  3. Enhance model performance: Fine-tune the models through hyperparameter optimization, feature engineering, and model ensemble techniques to improve predictive accuracy and robustness.
  4. Implement dynamic pricing: Optimize prices based on decision windows, such as weekly pricing for offline retailers, to adapt to changing market conditions and customer demand.
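
As a rough sketch of what item 4 could look like, the snippet below re-fits a median-demand model on a trailing window of history and re-runs a grid search each decision cycle. It assumes a 'date' column and a single-feature model, neither of which is part of the dataset used in this post; it is purely illustrative:

import numpy as np
import pandas as pd
from pygam import ExpectileGAM, s

# Hedged sketch of a weekly decision window: re-fit on recent history,
# then pick the grid price that maximizes predicted revenue.
# Assumes a 'date' column (not present in this post's dataset).
def weekly_price(history: pd.DataFrame, price_grid: np.ndarray,
                 lookback_days: int = 90) -> float:
    cutoff = history['date'].max() - pd.Timedelta(days=lookback_days)
    window = history[history['date'] >= cutoff]
    gam = ExpectileGAM(s(0), expectile=0.5).fit(
        window[['price']], window['quantity_sold'])
    pred_qty = gam.predict(price_grid.reshape(-1, 1))
    return float(price_grid[(price_grid * pred_qty).argmax()])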

To achieve a comprehensive price optimization process that takes into account the myriad of factors influencing the retail landscape, we will focus on several key components:

  1. Advanced Machine Learning Techniques: Explore and incorporate advanced machine learning techniques, such as deep learning, reinforcement learning, and ensemble methods. These techniques will enable us to capture the intricate relationships between various factors and pricing outcomes, allowing us to build sophisticated models that adapt to the ever-changing retail environment.
  2. Real-Time Data Integration: Integrate our models with live data streams from various sources, including sales data, customer feedback, social media sentiment, and competitor pricing information. By leveraging real-time data, we can make informed pricing decisions on the fly and quickly respond to market changes.
  3. Scenario Planning and Simulation: Develop scenario planning and simulation capabilities to test the impact of different pricing strategies under various market conditions. By running simulations and analyzing the results, we can identify the most effective pricing approaches for each product and market segment while assessing potential risks and rewards.
  4. Continuous Learning and Adaptation: Implement continuous learning mechanisms that allow our models to adapt and improve over time. By constantly feeding new data into our models and refining our algorithms, we can ensure that our pricing strategies remain relevant and effective in the face of changing market dynamics.

In future blog posts, I will dive deeper into the implementation of a comprehensive price optimization process, exploring the advanced techniques and approaches that will shape the future of retail pricing.

If you found this content valuable, please like and follow me for more insights on business analytics and data-driven decision-making. Thank you for your support!
