Reading

books and papers, current and past

currently reading (2)

  • Scaling Laws for Neural Language Models — Kaplan et al. Re-reading carefully alongside the Chinchilla paper.
  • Software Engineering at Google — Titus Winters et al. Slow read. Skipping the chapters that aren't relevant; lingering on the ones that are.

finished (2)

  • The Unreasonable Effectiveness of Data — Halevy, Norvig, Pereira. Old but somehow still right. Worth a re-read every couple of years.
  • Training Compute-Optimal Large Language Models (Chinchilla) — Hoffmann et al. Paired well with Kaplan. Worth memorizing the loss-vs-compute curves.

queued (1)

  • Deep Learning with PyTorch — Eli Stevens, Luca Antiga, Thomas Viehmann. Lined up after Chinchilla.