Add Momentum SGD optimizer implementation #13680

Adhithya-Laxman · 2025-10-22T08:43:45Z

Description

This PR implements the Momentum SGD optimizer using pure NumPy as part of the effort to add neural_network/optimizers to the repository.

This PR addresses part of issue #13662.

What does this PR do?

Implements a Momentum SGD optimizer that accelerates gradient descent by adding momentum to weight updates
Uses velocity accumulation to dampen oscillations and speed up convergence
Provides a clean, educational implementation without external deep learning frameworks

Implementation Details

Algorithm: SGD with momentum

Update rule:

velocity = momentum * velocity - learning_rate * gradient
param = param + velocity

Pure NumPy: No PyTorch, TensorFlow, or other frameworks required
Educational focus: Clear variable names, detailed docstrings, and comments
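
The update rule above can be sketched in NumPy roughly as follows. This is a minimal illustration rather than the code submitted in this PR: the function name `momentum_sgd_step` and the default values for `learning_rate` and `momentum` are assumptions chosen for the example.

```python
from __future__ import annotations

import numpy as np


def momentum_sgd_step(
    param: np.ndarray,
    gradient: np.ndarray,
    velocity: np.ndarray,
    learning_rate: float = 0.01,  # assumed default, not taken from the PR
    momentum: float = 0.9,  # assumed default, not taken from the PR
) -> tuple[np.ndarray, np.ndarray]:
    """Apply one momentum SGD update and return (new_param, new_velocity)."""
    # velocity = momentum * velocity - learning_rate * gradient
    new_velocity = momentum * velocity - learning_rate * gradient
    # param = param + velocity
    new_param = param + new_velocity
    return new_param, new_velocity


param = np.array([1.0, -2.0])
velocity = np.zeros_like(param)
gradient = np.array([0.5, -0.5])

# Starting from zero velocity, the first step matches a plain SGD step.
param, velocity = momentum_sgd_step(param, gradient, velocity, learning_rate=0.1)
```

Returning the new velocity instead of mutating it in place keeps the step function free of hidden state, which makes it easy to test; an optimizer class would typically store the velocity as an attribute instead.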

Features

✅ Complete docstrings with parameter descriptions
✅ Type hints for all function parameters and return values
✅ Doctests for correctness validation
✅ Usage example demonstrating the optimizer on a quadratic function
✅ PEP 8-compliant code formatting
✅ Momentum accumulation with configurable momentum factor

Testing

All doctests pass:

python -m doctest neural_network/optimizers/momentum_sgd.py -v

Linting passes:

ruff check neural_network/optimizers/momentum_sgd.py

Example output demonstrates proper convergence behavior.
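
A convergence check of this kind might look like the sketch below. The PR's example is not reproduced here, so the objective f(x) = sum(x^2), the starting point, the step count, and the hyperparameters are all assumptions made for illustration.

```python
import numpy as np


def quadratic_gradient(x: np.ndarray) -> np.ndarray:
    """Gradient of the assumed objective f(x) = sum(x**2)."""
    return 2.0 * x


# Hypothetical starting point and hyperparameters for this sketch.
x = np.array([5.0, -3.0])
velocity = np.zeros_like(x)
learning_rate = 0.1
momentum = 0.9

initial_loss = float(np.sum(x**2))

for _ in range(300):
    # Same update rule as in the PR description.
    velocity = momentum * velocity - learning_rate * quadratic_gradient(x)
    x = x + velocity

final_loss = float(np.sum(x**2))
print(f"initial loss: {initial_loss:.4f}, final loss: {final_loss:.3e}")
```

With these settings the iterates oscillate around the minimum at the origin while shrinking in amplitude, so the final loss ends up many orders of magnitude below the initial one.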

This PR is the second optimizer in the planned sequence outlined in #13662:

✅ Stochastic Gradient Descent (SGD) - feat: add SGD optimizer for neural networks #13671
✅ Momentum SGD (this PR)
⏳ Nesterov Accelerated Gradient (NAG) - upcoming
⏳ Adagrad - upcoming
⏳ Adam - upcoming
⏳ Muon - upcoming

Next Steps

Additional optimizers (Adam, Adagrad, NAG, Muon) will be submitted in follow-up PRs to maintain focused, reviewable contributions.

Add Momentum SGD optimizer implementation

a662792

- Implements SGD with momentum using pure NumPy
- Includes comprehensive docstrings and type hints
- Adds doctests for validation
- Provides usage example demonstrating convergence
- Follows PEP8 coding standards

algorithms-keeper bot added the "awaiting reviews" label (This PR is ready to be reviewed) on Oct 22, 2025

This was referenced Oct 23, 2025

Add muon optimizer #13719 (Closed)
Add muon optimizer #13720 (Closed)
Add Muon optimizer #13721 (Closed)
Add Muon optimizer implementation #13724 (Closed)
Added Nesterov and Adam Optimizers #13718 (Open)
