Ben ManninTowards Data ScienceHow to sample from language modelsCausal language models like GPT-2 are trained to predict the probability of the next word given some context. For example, given “I ate a…6 min read·May 24, 2019--3
Ben ManninTowards Data ScienceAn intuitive understanding of the LAMB optimizerIn software engineering, decreasing cycle time has a super-linear effect on progress. In modern deep learning, cycle time is often on the…9 min read·May 1, 2019--11
Ben Mann5 tips to poop like a championThe perfect poop is one large snake that comes out clean as soon as I sit down. It should happen around the same time every day. There…3 min read·Aug 1, 2017--2
Ben ManninTowards Data ScienceScaling Transformer-XL to 128 GPUsBen Mann, Yaroslav Bulatov, Darius Lam12 min read·May 9, 2019--1
Unbecoming10 Seconds That Ended My 20 Year MarriageIt’s August in Northern Virginia, hot and humid. I still haven’t showered from my morning trail run. I’m wearing my stay-at-home mom…·4 min read·Feb 16, 2022--946
AL AnanyThe ChatGPT Hype Is Over — Now Watch How Google Will Kill ChatGPT.It never happens instantly. The business game is longer than you know.·6 min read·Sep 1--296
Shawhin TalebiinTowards Data ScienceHow to Build an LLM from ScratchData Curation, Transformers, Training at Scale, and Model Evaluation·16 min read·5 days ago--7
Scott-Ryan AbtinPitfallBye Bye, SpotifyAnd see ya later, all you subscription services in my little empire·4 min read·Aug 19--249
Dmitry KruglovinBetter ProgrammingThe Architecture of a Modern StartupHype wave, pragmatic evidence vs the need to move fast16 min read·Nov 7, 2022--53
Nick HiltonThe End of the Subscription Era is ComingYou’re overpaying for your porn (and journalism)10 min read·Aug 30--205