Working Notes: a commonplace notebook for recording & exploring ideas.
— Kunal
Stochastic Gradient Descent
- Update weights using the gradient from a single randomly chosen training example (or a small subset) instead of the full dataset.
- Very noisy updates
- but each step is cheap
- much faster overall than full-batch gradient descent for machine learning
- works because samples are redundant: many examples carry similar gradient information, so a full pass adds little per step
- The main reason for mini-batching is that hardware is more efficient at batched computation
- per-sample gradients parallelize in a simple way, and batching is the natural way to exploit that parallelism
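A minimal sketch of the single-instance update described above, using a made-up linear-regression example (the data, learning rate, and step count are assumptions, not from the notes):

```python
import numpy as np

# Synthetic linear-regression problem (assumed example for illustration).
rng = np.random.default_rng(0)
n, d = 200, 3
X = rng.normal(size=(n, d))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.01 * rng.normal(size=n)

w = np.zeros(d)
lr = 0.05
for step in range(2000):
    i = rng.integers(n)              # pick one instance, not the whole batch
    grad = (X[i] @ w - y[i]) * X[i]  # gradient of 0.5 * (x_i . w - y_i)^2
    w -= lr * grad                   # noisy but very cheap update

print(np.round(w, 2))
```

Each step touches one row of `X`, so its cost is independent of the dataset size; the noise averages out and `w` still converges close to `true_w`. Swapping the single index `i` for a small slice of indices turns this into mini-batch SGD.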