You have been granted the power to predict the future. That’s cool and all, but how erm…wrong… are you?
So let’s say you’re trying to predict how long it takes to get your pizza delivered based on how far away you are from the restaurant. You collect this small, humble dataset (the same one used in the code below):

Distance (km): 1, 2, 3, 4, 5
Delivery time (min): 15, 17, 20, 22, 26
You, as an intelligent being, decide to build a linear regression model to predict delivery time based on distance. NICE! But now comes the big question:
Statistically speaking, how good are your predictions?
That’s where RSS, MSE and RMSE come in. But before we understand these three, we need to know what residuals are.
Step 1: Understand Residuals
A residual is just:
residual = actual_value - predicted_value
It tells you how off each prediction was.
For example:
If your model says delivery will take 18 mins but it actually took 20, the residual is 20 − 18 = 2 (the model underestimated by 2 mins). A negative residual would mean the model overestimated.
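In code, a residual is just a subtraction. A quick sketch using the pizza numbers above (the 18-minute prediction is the example’s, not a real model’s):

```python
# Residual for a single prediction: actual minus predicted
actual_time = 20     # minutes the delivery really took
predicted_time = 18  # minutes the model predicted

residual = actual_time - predicted_time
print(residual)  # 2 -> positive means the model underestimated
```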
We’ll build on this idea to evaluate how bad (or good) our model is using three popular metrics:
The Metrics
1️⃣ RSS: Residual Sum of Squares
This is…well, the sum of squared residuals obviously, i.e. how much total “oops” your model made.
Formula:
RSS = Σ(actual − predicted)²
If it isn’t obvious at this point, we take the residuals, square them, then take the sum. It’s in the name ❤️
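To make the formula concrete, here’s a tiny pure-Python sketch using the same delivery times the full code example below works with:

```python
# RSS: square each residual, then sum the squares
actual = [15, 17, 20, 22, 26]     # minutes (actual delivery times)
predicted = [14, 16, 19, 23, 25]  # minutes (hypothetical model output)

rss = sum((a - p) ** 2 for a, p in zip(actual, predicted))
print(rss)  # 5 -> residuals are [1, 1, 1, -1, 1], squares all 1
```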
2️⃣ MSE: Mean Squared Error
The average squared error. With RSS, the larger the dataset, the bigger the value. To level the playing field, just divide RSS by the number of data points. EQUALITY✨ RIGHT?
MSE = RSS / n = (1/n) * Σ(actual − predicted)²
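The “level the playing field” point is easy to demonstrate: duplicate a dataset and RSS doubles, while MSE stays put. A small sketch with made-up residuals:

```python
# MSE is RSS divided by the number of points, so it is
# comparable across datasets of different sizes.
residuals_small = [1, -1, 1]
residuals_big = residuals_small * 2  # same error pattern, twice the data

rss_small = sum(r ** 2 for r in residuals_small)
rss_big = sum(r ** 2 for r in residuals_big)
mse_small = rss_small / len(residuals_small)
mse_big = rss_big / len(residuals_big)

print(rss_small, rss_big)  # 3 6   -> RSS grows with dataset size
print(mse_small, mse_big)  # 1.0 1.0 -> MSE does not
```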
3️⃣ RMSE: Root Mean Squared Error
This is the square root of MSE. It puts the error back in the original unit (minutes, in this case). Much easier to interpret.
NOTE: Since we squared all the residuals in RSS and MSE, the units also get squared (minutes become minutes²). This is why RMSE puts the error back in the original unit.
Okay, enough comedy, let’s do this in code:
import numpy as np
import matplotlib.pyplot as plt
# Distance in km
X = np.array([1, 2, 3, 4, 5])
# Actual delivery times in minutes
y_actual = np.array([15, 17, 20, 22, 26])
# Let’s say your model predicted:
y_predicted = np.array([14, 16, 19, 23, 25]) # a bit off, but not terrible
# Calculate residuals
residuals = y_actual - y_predicted
# RSS
RSS = np.sum(residuals ** 2)
# MSE
MSE = RSS / len(y_actual)
# RMSE
RMSE = np.sqrt(MSE)
print(f"Residuals: {residuals}")
print(f"RSS: {RSS}")
print(f"MSE: {MSE}")
print(f"RMSE: {RMSE:.2f}")
Output:
Residuals: [ 1 1 1 -1 1]
RSS: 5
MSE: 1.0
RMSE: 1.00
Interpretation:
RSS = 5 → Total squared error across all predictions.
MSE = 1.0 → The average squared error is 1.0 min² (note the squared units, which are hard to interpret directly).
RMSE = 1.00 → On average, predictions are off by about 1 minute. Clean and easy to interpret.
So, When Do You Use What?
RSS mostly lives inside the fitting process: it’s the quantity ordinary least squares minimizes. MSE is handy as a training loss and for comparing models, since dividing by n makes it independent of dataset size. RMSE is the one to quote to humans, because it’s in the original units (minutes, here).
TL;DR
1. Residuals are how off each prediction is (actual − predicted)
2. RSS is the total squared error (sum of squares of residuals)
3. MSE is the average squared error. (RSS/no. of rows)
4. RMSE is your go-to, sane metric for “how far off are we, on average?” (square root of the MSE)
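If you want these as reusable helpers, the whole TL;DR fits in a few lines of plain Python (function names are my own, not from any library):

```python
import math

def rss(actual, predicted):
    """Residual Sum of Squares: total squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))

def mse(actual, predicted):
    """Mean Squared Error: RSS averaged over the number of points."""
    return rss(actual, predicted) / len(actual)

def rmse(actual, predicted):
    """Root Mean Squared Error: MSE back in the original units."""
    return math.sqrt(mse(actual, predicted))

# Pizza data from the example above
y_actual = [15, 17, 20, 22, 26]
y_predicted = [14, 16, 19, 23, 25]
print(rss(y_actual, y_predicted), mse(y_actual, y_predicted), rmse(y_actual, y_predicted))
# 5 1.0 1.0
```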
Final Thoughts
Metrics like RMSE help us quantify the quality of our models. Evaluating your models will save you time, money and dignity 🙂 So before you start flexing your model, why don’t you try testing it first?
Next up, we can dive into:
- What happens if your errors aren’t normally distributed?
- MAE vs RMSE
- Cross-validation
- Regularized regression
Stay tuned for good vibes, good lessons and ‘GOOD’ MEMES.