
Adding L2 Regularization in PyTorch

Updated June 22, 2023

Learn how to implement L2 regularization in your PyTorch models and improve their generalizability.

Definition of the Concept

Regularization is a technique used in machine learning to prevent overfitting by adding a penalty term to the loss function. The two most commonly used types of regularization are L1 (Lasso) and L2 (Ridge). In this article, we’ll focus on L2 regularization.

What is L2 Regularization?

L2 regularization, closely related to Ridge regression and often called weight decay in deep learning, adds a term to the loss function that's proportional to the sum of the squares of the model's weights. This term discourages large weight values and encourages smaller ones, which tends to produce models that generalize better.
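Concretely, if L is the model's original loss, the regularized loss is

    L_total = L + λ · Σᵢ wᵢ²

where the sum runs over the model's weights and λ (lambda) sets how strongly large weights are penalized.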

Step-by-Step Explanation

To add L2 regularization in PyTorch, follow these steps:

1. Import Necessary Libraries

First, make sure you have PyTorch installed (pip install torch). Then, import the necessary libraries:

import torch
import torch.nn as nn

2. Define Your Model

Next, define your model architecture using PyTorch’s nn.Module class:

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)  # input layer (flattened 28x28 images) -> hidden layer (128 units)
        self.fc2 = nn.Linear(128, 10)   # hidden layer (128 units) -> output layer (10 classes)

    def forward(self, x):
        x = x.view(x.size(0), -1)         # flatten (batch, 1, 28, 28) images to (batch, 784)
        x = torch.relu(self.fc1(x))       # activation function for the hidden layer
        x = self.fc2(x)
        return x
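Before wiring this into a training loop, it's worth sanity-checking the output shape with a dummy batch (a quick sketch; the batch size of 4 is arbitrary):

model = MyModel()
dummy_inputs = torch.randn(4, 784)  # a hypothetical batch of 4 flattened images
print(model(dummy_inputs).shape)    # expected: torch.Size([4, 10])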

3. Add L2 Regularization

Now, add the L2 regularization term to your model’s loss function:

# Create a model instance
model = MyModel()

# Define the base loss function; the L2 term is added in custom_loss_fn below
criterion = nn.CrossEntropyLoss()
l2_lambda = 0.01  # hyperparameter for L2 regularization strength

def custom_loss_fn(model, inputs, labels):
    outputs = model(inputs)
    # Each p**2 is a tensor, so reduce with .sum() before adding across parameters
    l2_penalty = sum(p.pow(2).sum() for p in model.parameters())
    return criterion(outputs, labels) + l2_lambda * l2_penalty

In the above code, l2_lambda is a hyperparameter that controls the strength of L2 regularization: a larger value pushes the weights more aggressively toward zero, while a smaller value weakens the penalty. Note that this version penalizes every parameter, biases included.
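In practice, the penalty is often applied only to the weight matrices and not to the bias terms, since biases rarely contribute to overfitting. Here is one way to do that (a sketch that filters parameters by name via named_parameters):

def custom_loss_fn_weights_only(model, inputs, labels):
    outputs = model(inputs)
    # Penalize only parameters whose names lack "bias" (e.g. fc1.weight, fc2.weight)
    l2_penalty = sum(p.pow(2).sum() for name, p in model.named_parameters() if "bias" not in name)
    return criterion(outputs, labels) + l2_lambda * l2_penalty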

4. Train Your Model

Finally, train your model using your custom loss function:

# Create a dataset instance (e.g., MNIST)
train_dataset = ...

# Create a data loader for training (shuffling each epoch is standard practice)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=32, shuffle=True)

# Set up the optimizer and, optionally, a learning-rate scheduler
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.7)  # multiply lr by 0.7 every 5 epochs

# Train the model
for epoch in range(10):
    for batch in train_loader:
        inputs, labels = batch
        optimizer.zero_grad()                         # clear gradients from the previous step
        loss = custom_loss_fn(model, inputs, labels)  # forward pass + regularized loss
        loss.backward()                               # backpropagate
        optimizer.step()                              # update the weights

    scheduler.step()  # advance the learning-rate schedule once per epoch

That’s it! By following these steps, you’ve added L2 regularization to your PyTorch model, which should help it generalize better to unseen data.
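One more thing worth knowing: PyTorch's built-in optimizers can apply an L2 penalty for you through their weight_decay argument, so the manual term above is only necessary when you want fine-grained control (e.g. the weights-only variant). A minimal sketch:

# With weight_decay set, train against the plain criterion(outputs, labels) loss;
# the optimizer adds l2_lambda * w to each weight's gradient at every step.
# (That is the gradient of (l2_lambda / 2) * w**2, so the effective strength
# is half that of the manual l2_lambda * w**2 term.)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=l2_lambda)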

Tips and Variations:

  • Experiment with different values of l2_lambda to see how it affects your model’s performance.
  • Consider using other types of regularization, such as dropout or L1 (Lasso) regularization, depending on your specific problem and dataset.
  • Try a different optimizer, such as AdamW (which applies decoupled weight decay; see the sketch below) or RMSProp, if you’re experiencing convergence issues.
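For the AdamW suggestion above, the switch is a one-line change (a minimal sketch; the decay value is just a starting point to tune):

# AdamW applies decoupled weight decay: the decay is applied directly to the
# weights during the update rather than being mixed into the adaptive gradient,
# which often works better than Adam's coupled L2 penalty.
optimizer = torch.optim.AdamW(model.parameters(), lr=0.001, weight_decay=0.01)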

I hope this tutorial has been helpful in understanding how to add L2 regularization in PyTorch. Happy coding!
