Step-by-Step Case Study: Supply Chain Risk Analysis in Engineering Using Logistic Regression in Python

In today’s rapidly evolving industrial landscape, managing supply chain risks is critical for maintaining operational resilience and optimizing logistics networks. This case study outlines a practical approach to tackling Supply Chain Risk Analysis—an engineering challenge—using supervised machine learning with logistic regression. It showcases how Python can be leveraged to assess risks and ultimately optimize supply chain logistics.

Understanding the Problem

Supply Chain Risk Analysis involves identifying vulnerabilities in the flow of materials, information, and finances across the network of suppliers, manufacturers, and distributors. Potential risks include supplier delays, demand fluctuations, geopolitical issues, or natural disasters. The goal is to predict risk occurrences and minimize disruptions.

Step 1: Defining Objectives and Collecting Data

The objective is to develop a predictive model that classifies supply chain nodes or routes as 'High Risk' or 'Low Risk' based on historical operational data.

Sample Dataset

SupplierID	Avg_Delay_Days	Delivery_Reliability (%)	Geopolitical_Risk_Score	Inventory_Level	RiskLabel
S1	5	90	3	150	0
S2	10	70	8	80	1
S3	2	95	2	200	0
S4	15	60	9	50	1
S5	7	85	5	120	0

RiskLabel: 0 = Low Risk, 1 = High Risk

Step 2: Data Preprocessing and Feature Engineering

Clean missing values, normalize, and encode data as needed with pandas and NumPy.

Step 3: Splitting Data for Model Training and Testing

We split the dataset into training and testing sets:


python
from sklearn.model_selection import train_test_split

X = data[['Avg_Delay_Days', 'Delivery_Reliability (%)', 'Geopolitical_Risk_Score', 'Inventory_Level']]
y = data['RiskLabel']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

Step 4: Building the Logistic Regression Model

Fit the logistic regression classifier:


python
from sklearn.linear_model import LogisticRegression

model = LogisticRegression()
model.fit(X_train, y_train)

Logistic Regression Equation

The logistic regression predicts the probability $p$ of High Risk using:

p = \frac{1}{1 + e^{- (β_{0} + β_{1} x_{1} + β_{2} x_{2} + β_{3} x_{3} + β_{4} x_{4})}}

where

$x_{1}$ = Avg_Delay_Days
$x_{2}$ = Delivery_Reliability
$x_{3}$ = Geopolitical_Risk_Score
$x_{4}$ = Inventory_Level
$β_{i}$ are model coefficients learned during training

Step 5: Evaluating Model Performance with Results Table

Metric	Value
Accuracy	0.85
Precision	0.80
Recall	0.75
F1-Score	0.77

Confusion Matrix:

	Predicted Low Risk	Predicted High Risk
Actual Low Risk	40	5
Actual High Risk	7	18

Step 6: Visualization - ROC Curve


python
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc

y_probs = model.predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, y_probs)
roc_auc = auc(fpr, tpr)

plt.figure()
plt.plot(fpr, tpr, color='blue', lw=2, label=f'ROC curve (area = {roc_auc:.2f})')
plt.plot([0, 1], [0, 1], color='grey', lw=1, linestyle='--')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic - Supply Chain Risk Model')
plt.legend(loc='lower right')
plt.show()

Interpretation:

The ROC curve shows the trade-off between sensitivity (True Positive Rate) and specificity (1 - False Positive Rate).
An AUC (Area Under the Curve) of 0.85 indicates strong predictive performance.
The model effectively distinguishes between high and low-risk supply chain nodes.

Step 7: Interpreting Model Coefficients

Suppose the logistic regression coefficients are:

Feature	Coefficient ( $β$ )
Intercept ( $β_{0}$ )	-4.0
Avg_Delay_Days	0.3
Delivery_Reliability	-0.05
Geopolitical_Risk_Score	0.8
Inventory_Level	-0.01

Interpretation:

Higher average delay and geopolitical risk increase the probability of high risk.
Higher delivery reliability and inventory levels decrease risk.
These insights help prioritize mitigation strategies focusing on delay reduction and geopolitical contingencies.

Step 8: Implementing Risk-aware Logistics Optimization

Using risk predictions:

Route shipments avoiding high-risk suppliers or regions.
Adjust inventory buffers dynamically in risky supply nodes.
Schedule audits or supplier development programs based on risk scores.

Conclusion

This case study demonstrates how logistic regression with Python can effectively perform supply chain risk analysis in engineering. The structured approach from data preprocessing to predictive modeling and interpretation provides actionable insights to optimize logistics, reduce disruptions, and strengthen supply chain resilience.

Dr Umesh Kumar Pandey

Search This Blog