2024 11 llm advanced learn

LLM - Advanced learning

Q: What comes next after you've written GPT-2 from scratch? What is interesting to read about to keep your motivation high?

Bonus material from Sebastian's book. Specifically going line by line and understanding how to convert GPT-2 to Llama 3.2: https://github.com/rasbt/LLMs-from-scratch/tree/main/ch05%2F07_gpt_to_llama

Learn RoPE, plus the intuition behind RMSNorm, SiLU, and grouped-query attention, to modernize your understanding of LLMs.
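If it helps to see three of these pieces in code, here is a minimal pure-Python sketch of RMSNorm, SiLU, and the head mapping used by grouped-query attention. The function names, the `eps` default, and the head counts are illustrative assumptions, not taken from the book's code.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    # Scale by the root-mean-square of the vector; unlike LayerNorm,
    # there is no mean subtraction and no bias term.
    mean_square = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_square + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]

def silu(x):
    # SiLU (a.k.a. swish): x * sigmoid(x). Fine for a sketch, but
    # math.exp can overflow for very negative x.
    return x / (1.0 + math.exp(-x))

def kv_head_for_query(query_head, n_query_heads, n_kv_heads):
    # Grouped-query attention: consecutive query heads share one K/V head.
    group_size = n_query_heads // n_kv_heads
    return query_head // group_size

normed = rms_norm([3.0, 4.0], weight=[1.0, 1.0])
mapping = [kv_head_for_query(h, 8, 2) for h in range(8)]
```

With 8 query heads and 2 K/V heads, `mapping` shows heads 0 to 3 reading from K/V head 0 and heads 4 to 7 from K/V head 1, which is where the KV-cache savings come from.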

3/ This piece really unlocked BF16 for me: https://magazine.sebastianraschka.com/p/the-missing-bits-llama-2-weights It's pretty basic, but it's good for building intuition rather than just using BF16 to fix your NaN loss because someone told you to.
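A quick way to feel the trade-off is to round a few numbers to each format yourself. The sketch below uses only Python's `struct` module; `to_bf16` and `to_fp16` are hypothetical helper names, and the BF16 rounding is a hand-rolled round-to-nearest-even on the float32 bit pattern that ignores NaN.

```python
import struct

def to_bf16(x):
    # BF16 is the top 16 bits of a float32: same 8-bit exponent, 7-bit mantissa.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)  # round to nearest, ties to even
    bits &= 0xFFFF0000                   # drop the low 16 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def to_fp16(x):
    # FP16 has a 5-bit exponent and 10-bit mantissa; struct refuses to pack
    # values beyond its range, which is treated as overflow to inf here.
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return float("inf")

big_bf16 = to_bf16(70000.0)    # 70144.0: coarse, but still finite
big_fp16 = to_fp16(70000.0)    # inf: outside FP16's range (max 65504)
small_bf16 = to_bf16(1.001)    # 1.0: BF16 loses the small increment
small_fp16 = to_fp16(1.001)    # closer to 1.001 than the BF16 value
```

BF16 keeps float32's range and gives up precision, while FP16 does the opposite, which is why overflow-driven NaNs tend to show up with FP16 rather than BF16.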

4/ I wanted a better intuition around how Flash Attention worked behind the scenes:

https://www.youtube.com/embed/gBMO1JZav44 Great video! Tiling + SRAM!
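The tiling part can be tried without a GPU. Below is a minimal pure-Python sketch of the online-softmax trick for a single query: it walks over key/value blocks, keeps a running max and running sum, and never materializes the full score row. It is only an illustration of the math, not the real kernel; the function names and `block_size` are assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def naive_attention(q, keys, values):
    # Reference: compute every score first, then softmax, then mix the values.
    scale = 1.0 / math.sqrt(len(q))
    scores = [dot(q, k) * scale for k in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(len(values[0]))]

def tiled_attention(q, keys, values, block_size=2):
    # Process one block of keys/values at a time, rescaling earlier partial
    # results whenever a new running max is found.
    scale = 1.0 / math.sqrt(len(q))
    running_max = -math.inf
    running_sum = 0.0
    acc = [0.0] * len(values[0])
    for start in range(0, len(keys), block_size):
        k_blk = keys[start:start + block_size]
        v_blk = values[start:start + block_size]
        scores = [dot(q, k) * scale for k in k_blk]
        new_max = max(running_max, max(scores))
        correction = math.exp(running_max - new_max)  # 0.0 on the first block
        weights = [math.exp(s - new_max) for s in scores]
        running_sum = running_sum * correction + sum(weights)
        acc = [a * correction + sum(w * v[j] for w, v in zip(weights, v_blk))
               for j, a in enumerate(acc)]
        running_max = new_max
    return [a / running_sum for a in acc]

q = [0.1, -0.4, 0.7, 0.2]
keys = [[0.3, 0.1, -0.2, 0.5], [1.0, -1.0, 0.5, 0.0], [-0.6, 0.2, 0.9, 0.4],
        [0.0, 0.8, -0.3, 0.1], [0.7, 0.7, 0.7, -0.7]]
values = [[1.0, 0.0], [0.0, 1.0], [2.0, -1.0], [0.5, 0.5], [-1.0, 3.0]]
reference = naive_attention(q, keys, values)
tiled = tiled_attention(q, keys, values, block_size=2)
```

Both functions should agree up to floating-point error. On a GPU, each block is sized to fit in fast on-chip SRAM, which is where the speedup comes from.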

5/ Curious how diffusion models can be faster? This is the most intuitive post on flow matching, made by a rando on the Internet you'd never usually find:

https://tommyc.xyz/posts/simple-diffusion
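For a rough feel of why straight paths help, here is a toy pure-Python sketch of the linear-interpolation flavor of flow matching: the training target is the constant velocity from a noise sample to a data sample, and sampling is plain Euler integration. This assumes the straight-path formulation, and the linked post may use different notation or conventions. The `oracle` below stands in for a trained network and is purely hypothetical.

```python
def interpolate(x0, x1, t):
    # Point on the straight line from noise x0 (t=0) to data x1 (t=1).
    return [(1 - t) * a + t * b for a, b in zip(x0, x1)]

def target_velocity(x0, x1):
    # Regression target for the straight path: d/dt of interpolate() is x1 - x0.
    return [b - a for a, b in zip(x0, x1)]

def euler_sample(x0, velocity_fn, num_steps):
    # Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with fixed steps.
    x = list(x0)
    dt = 1.0 / num_steps
    for step in range(num_steps):
        v = velocity_fn(x, step * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

noise = [0.5, -1.0]
data = [2.0, 3.0]

def oracle(x, t):
    # Stand-in for a perfectly trained model on this single noise/data pair.
    return target_velocity(noise, data)

one_step = euler_sample(noise, oracle, num_steps=1)
four_steps = euler_sample(noise, oracle, num_steps=4)
```

With a perfectly straight path, one Euler step lands on the data point just as well as four. A learned velocity field is only approximately straight, so real samplers still need several steps, but far fewer than a curved path would.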

6/ Best post on RoPE:

https://huggingface.co/blog/designing-positional-encoding Definitely play with the formulas in a notebook to gain a stronger intuition.
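As a starting point for that notebook, here is a minimal pure-Python sketch of RoPE that rotates adjacent pairs of dimensions by a position-dependent angle. The interleaved pairing and `base=10000.0` are assumptions; some implementations pair the first half of the vector with the second half instead.

```python
import math

def rope(vec, position, base=10000.0):
    # Rotate each pair (vec[i], vec[i+1]) by position * base**(-i / dim),
    # so low dimensions spin fast and high dimensions spin slowly.
    dim = len(vec)
    out = []
    for i in range(0, dim, 2):
        theta = position * base ** (-i / dim)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

q = [0.3, -1.2, 0.8, 0.5]
k = [1.1, 0.4, -0.7, 0.9]

# Same relative offset (2), different absolute positions.
score_near = dot(rope(q, 3), rope(k, 1))
score_far = dot(rope(q, 10), rope(k, 8))
```

The two scores match up to floating-point error: after rotation, the query-key dot product depends only on the distance between positions, not on where the pair sits in the sequence.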

7/ If you want to help us make an AI graphic designer, consider joining our team where our mission is to push the boundaries of what computers can do with graphics. Jobs page: playground.com/jobs

Our research team trained this model from scratch: playground.com/pg-v3

Building a great research team is core to what we do to surpass what trad software can do today.

8/ Best tutorial for coding a VLM from scratch:

https://www.youtube.com/embed/vAmKB7iPkWw

I wish it were Pixtral, which is a super clean arch, but I’ll settle for PaliGemma for now.

This is next on my list after conquering RoPE + grouped-query attn on the plane tomorrow.