Step-by-Step Case Study: Supply Chain Risk Analysis in Engineering Using Logistic Regression in Python

In today’s rapidly evolving industrial landscape, managing supply chain risks is critical for maintaining operational resilience and optimizing logistics networks. This case study outlines a practical approach to tackling Supply Chain Risk Analysis—an engineering challenge—using supervised machine learning with logistic regression. It showcases how Python can be leveraged to assess risks and ultimately optimize supply chain logistics.


Understanding the Problem

Supply Chain Risk Analysis involves identifying vulnerabilities in the flow of materials, information, and finances across the network of suppliers, manufacturers, and distributors. Potential risks include supplier delays, demand fluctuations, geopolitical issues, or natural disasters. The goal is to predict risk occurrences and minimize disruptions.


Step 1: Defining Objectives and Collecting Data

The objective is to develop a predictive model that classifies supply chain nodes or routes as 'High Risk' or 'Low Risk' based on historical operational data.


Sample Dataset

SupplierID | Avg_Delay_Days | Delivery_Reliability (%) | Geopolitical_Risk_Score | Inventory_Level | RiskLabel
S1         | 5              | 90                       | 3                       | 150             | 0
S2         | 10             | 70                       | 8                       | 80              | 1
S3         | 2              | 95                       | 2                       | 200             | 0
S4         | 15             | 60                       | 9                       | 50              | 1
S5         | 7              | 85                       | 5                       | 120             | 0
  • RiskLabel: 0 = Low Risk, 1 = High Risk
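For readers following along, the sample rows could be loaded into a pandas DataFrame roughly as follows. This is a minimal sketch: the variable name `data` is chosen to match the later snippets, and five rows are far too few to train a meaningful model, so a real analysis would load a much larger historical dataset.

```python
import pandas as pd

# The five sample rows shown above; a real dataset would be far larger.
data = pd.DataFrame({
    "SupplierID": ["S1", "S2", "S3", "S4", "S5"],
    "Avg_Delay_Days": [5, 10, 2, 15, 7],
    "Delivery_Reliability (%)": [90, 70, 95, 60, 85],
    "Geopolitical_Risk_Score": [3, 8, 2, 9, 5],
    "Inventory_Level": [150, 80, 200, 50, 120],
    "RiskLabel": [0, 1, 0, 1, 0],  # 0 = Low Risk, 1 = High Risk
})

# Quick check of the class balance before modeling.
print(data["RiskLabel"].value_counts())
```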


Step 2: Data Preprocessing and Feature Engineering

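Before modeling, handle missing values (for example, by imputing them or dropping incomplete rows), scale the numeric features to comparable ranges, and encode any categorical fields as numbers. pandas and NumPy cover most of these steps.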
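As an illustration of what this step might look like, the sketch below imputes a missing value with the column median and standardizes the features. The `raw` DataFrame and its missing entry are hypothetical, and `StandardScaler` from scikit-learn is one possible choice of scaler rather than something the case study prescribes.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical raw data with one missing delay value, for illustration only.
raw = pd.DataFrame({
    "Avg_Delay_Days": [5, 10, np.nan, 15, 7],
    "Inventory_Level": [150, 80, 200, 50, 120],
})

# Fill missing numeric values with the median of each column.
filled = raw.fillna(raw.median())

# Standardize each feature to zero mean and unit variance.
scaler = StandardScaler()
scaled = pd.DataFrame(scaler.fit_transform(filled), columns=filled.columns)

print(scaled.round(2))
```

In a full workflow the scaler would normally be fitted on the training split only and then applied to the test split, so that no information from the test data leaks into training.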


Step 3: Splitting Data for Model Training and Testing

We split the dataset into training and testing sets:

python
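from sklearn.model_selection import train_test_split

# `data` is the pandas DataFrame holding the dataset from Step 1
X = data[['Avg_Delay_Days', 'Delivery_Reliability (%)', 'Geopolitical_Risk_Score', 'Inventory_Level']]
y = data['RiskLabel']

# Hold out 30% of the rows for testing; the fixed seed makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)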

Step 4: Building the Logistic Regression Model

Fit the logistic regression classifier:

python
from sklearn.linear_model import LogisticRegression

model = LogisticRegression()
model.fit(X_train, y_train)

Logistic Regression Equation

The logistic regression predicts the probability p of High Risk using:

$$p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4)}}$$

where

  • x1 = Avg_Delay_Days

  • x2 = Delivery_Reliability

  • x3 = Geopolitical_Risk_Score

  • x4 = Inventory_Level

  • βi are model coefficients learned during training
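To make the link between the equation and the fitted model concrete, the sketch below recomputes the probabilities by hand from a fitted model's intercept and coefficients and compares them with `predict_proba`. The data here is synthetic and hypothetical, generated only so that the example runs on its own; it is not the case study's dataset.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Small synthetic dataset with four features (hypothetical values).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 4))
y = (X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=40) > 0).astype(int)

model = LogisticRegression().fit(X, y)

# Linear score z = β0 + β1*x1 + β2*x2 + β3*x3 + β4*x4 for every row.
z = model.intercept_[0] + X @ model.coef_[0]

# Applying the logistic function to z gives the probability of class 1.
p_manual = 1.0 / (1.0 + np.exp(-z))
p_sklearn = model.predict_proba(X)[:, 1]

print(np.allclose(p_manual, p_sklearn))
```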


Step 5: Evaluating Model Performance with Results Table

Metric    | Value
Accuracy  | 0.85
Precision | 0.80
Recall    | 0.75
F1-Score  | 0.77

Confusion Matrix:

                 | Predicted Low Risk | Predicted High Risk
Actual Low Risk  | 40                 | 5
Actual High Risk | 7                  | 18
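As a cross-check, the four metrics can be derived directly from the confusion matrix counts. The sketch below treats High Risk as the positive class. Computed this way, the values come out slightly lower than those in the metrics table (roughly 0.83, 0.78, 0.72, and 0.75), so the reported figures are probably best read as rounded or illustrative.

```python
# Counts read from the confusion matrix (rows = actual, columns = predicted).
tn, fp = 40, 5   # actual Low Risk: predicted Low, predicted High
fn, tp = 7, 18   # actual High Risk: predicted Low, predicted High

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)          # of predicted High Risk, how many were truly High Risk
recall = tp / (tp + fn)             # of actual High Risk, how many were caught
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```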

Step 6: Visualization - ROC Curve

python
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc

# Predicted probability of the High Risk class (label 1) for each test row
y_probs = model.predict_proba(X_test)[:, 1]

fpr, tpr, thresholds = roc_curve(y_test, y_probs)
roc_auc = auc(fpr, tpr)

plt.figure()
plt.plot(fpr, tpr, color='blue', lw=2, label=f'ROC curve (area = {roc_auc:.2f})')
plt.plot([0, 1], [0, 1], color='grey', lw=1, linestyle='--')  # chance-level diagonal
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic - Supply Chain Risk Model')
plt.legend(loc='lower right')
plt.show()

Interpretation:

  • The ROC curve shows the trade-off between sensitivity (True Positive Rate) and specificity (1 - False Positive Rate).

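  • An AUC (Area Under the Curve) of 0.85 means that a randomly chosen high-risk node receives a higher predicted probability than a randomly chosen low-risk node about 85% of the time. This indicates strong predictive performance; 0.5 would correspond to random guessing.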

  • The model effectively distinguishes between high and low-risk supply chain nodes.
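One way to build intuition for the AUC is its ranking interpretation: it equals the fraction of (high-risk, low-risk) pairs in which the high-risk item receives the higher score. The sketch below checks this against `roc_auc_score` using a handful of hypothetical labels and scores that are not taken from the case study.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical labels and predicted probabilities, for illustration only.
y_true = np.array([0, 0, 1, 0, 1, 1, 0, 1])
y_score = np.array([0.10, 0.35, 0.40, 0.45, 0.60, 0.80, 0.20, 0.30])

pos = y_score[y_true == 1]  # scores of high-risk items
neg = y_score[y_true == 0]  # scores of low-risk items

# Compare every high-risk score with every low-risk score; ties count as half.
diff = pos[:, None] - neg[None, :]
pairwise = ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size

print(pairwise, roc_auc_score(y_true, y_score))
```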


Step 7: Interpreting Model Coefficients

Suppose the logistic regression coefficients are:

Feature                 | Coefficient (β)
Intercept (β0)          | -4.0
Avg_Delay_Days          | 0.3
Delivery_Reliability    | -0.05
Geopolitical_Risk_Score | 0.8
Inventory_Level         | -0.01

Interpretation:

  • Higher average delay and geopolitical risk increase the probability of high risk.

  • Higher delivery reliability and inventory levels decrease risk.

  • These insights help prioritize mitigation strategies focusing on delay reduction and geopolitical contingencies.
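To see how these coefficients translate into predictions, the sketch below plugs the supposed values into the logistic equation for suppliers S2 and S1 from the sample dataset, and converts two coefficients into odds ratios. The helper name `risk_probability` is hypothetical, and the coefficients are the illustrative ones above, not fitted values.

```python
import math

# Illustrative coefficients from the table above (not fitted values).
beta = {
    "Intercept": -4.0,
    "Avg_Delay_Days": 0.3,
    "Delivery_Reliability": -0.05,
    "Geopolitical_Risk_Score": 0.8,
    "Inventory_Level": -0.01,
}

def risk_probability(delay, reliability, geo_risk, inventory):
    """Probability of High Risk from the logistic equation."""
    z = (beta["Intercept"]
         + beta["Avg_Delay_Days"] * delay
         + beta["Delivery_Reliability"] * reliability
         + beta["Geopolitical_Risk_Score"] * geo_risk
         + beta["Inventory_Level"] * inventory)
    return 1.0 / (1.0 + math.exp(-z))

p_s2 = risk_probability(10, 70, 8, 80)   # S2: linear score z is about 1.1
p_s1 = risk_probability(5, 90, 3, 150)   # S1: linear score z is about -6.1

# exp(β) is the multiplicative change in the odds per one-unit increase in a feature.
odds_ratio_geo = math.exp(beta["Geopolitical_Risk_Score"])
odds_ratio_delay = math.exp(beta["Avg_Delay_Days"])

print(f"S2: {p_s2:.3f}  S1: {p_s1:.4f}")
print(f"odds ratio per geopolitical point: {odds_ratio_geo:.2f}, per delay day: {odds_ratio_delay:.2f}")
```

Under these assumed coefficients, S2 comes out at roughly a 75% probability of High Risk and S1 at well under 1%, which agrees with their labels in the sample dataset. Each additional point of geopolitical risk multiplies the odds of High Risk by about 2.2, and each additional day of average delay by about 1.35.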


Step 8: Implementing Risk-aware Logistics Optimization

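The model's risk predictions can then inform logistics decisions, so that nodes or routes classified as High Risk receive attention first.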
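One possible way to act on the predictions, offered as a hypothetical sketch rather than the case study's own procedure, is to rank suppliers by predicted probability and flag those above a cut-off for mitigation. The probabilities, the `p_high_risk` column name, and the 0.5 threshold below are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical predicted probabilities, e.g. from model.predict_proba(...)[:, 1].
predictions = pd.DataFrame({
    "SupplierID": ["S1", "S2", "S3", "S4", "S5"],
    "p_high_risk": [0.05, 0.75, 0.02, 0.90, 0.30],
})

THRESHOLD = 0.5  # assumed cut-off; tune it to the cost of a missed disruption

# Flag suppliers whose predicted risk reaches the threshold.
predictions["flagged"] = predictions["p_high_risk"] >= THRESHOLD

# Rank from highest to lowest risk so mitigation effort goes to the top first.
ranked = predictions.sort_values("p_high_risk", ascending=False).reset_index(drop=True)

print(ranked)
```

Lowering the threshold catches more potential disruptions at the price of more false alarms, which is the same trade-off the ROC curve in Step 6 visualizes.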



Conclusion

This case study demonstrates how logistic regression with Python can effectively perform supply chain risk analysis in engineering. The structured approach from data preprocessing to predictive modeling and interpretation provides actionable insights to optimize logistics, reduce disruptions, and strengthen supply chain resilience.