
How do LLMs work?

“Artificial Intelligence is a term for a technology that does not work. As soon as it starts working, you give it a new name.”
— Richard Campbell, at NDC Oslo in 2021, in a talk titled The Next Decade of Software Development.

More than one person has asked me the title question, so here is a blanket set of resources that may help you find an answer that is useful to you.

Reading

Academic papers are where all this starts, since this is more computer science than information technology:

“The perceptron: A probabilistic model for information storage and organization in the brain”, Rosenblatt, F. (1958) is the seminal paper for all “machine thinking” and is where to start if you want to board the train at the first station.
https://github.com/aimerou/awesome-ai-papers?tab=readme-ov-file#history
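To make Rosenblatt's idea concrete, here is a minimal NumPy sketch of the perceptron learning rule (the toy data, learning rate, and epoch count are illustrative assumptions, not from the paper): nudge the weights toward any example the current boundary misclassifies, and repeat until the data is separated.

```python
import numpy as np

def train_perceptron(X, y, epochs=100, lr=0.1):
    """Rosenblatt-style perceptron: labels y are in {-1, +1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) <= 0:   # misclassified (or on the boundary)
                w += lr * yi * xi        # nudge the weights toward the example
                b += lr * yi
    return w, b

# Linearly separable toy problem: logical AND
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, -1, -1, 1])
w, b = train_perceptron(X, y)
preds = np.sign(X @ w + b)
```

Sixty-odd years later, that "nudge the weights toward what you got wrong" loop is still recognizably the ancestor of gradient descent on a neural network.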

“Attention Is All You Need”, Vaswani et al. (2017) is the MUST-read. Without the ideas in this paper, nothing being called AI today would exist.
https://github.com/papers-we-love/papers-we-love/tree/main/artificial_intelligence
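For a concrete taste of that paper's central idea, here is a minimal NumPy sketch of scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V (the shapes and random inputs are illustrative assumptions; a real Transformer adds learned projections, multiple heads, and masking):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each query attends over all keys; output is a weighted mix of values."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # query/key similarity
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))   # 3 query positions, d_k = 4
K = rng.normal(size=(5, 4))   # 5 key positions
V = rng.normal(size=(5, 4))   # one value vector per key
out, attn = scaled_dot_product_attention(Q, K, V)
# each row of attn is a probability distribution over the 5 keys
```

Everything else in the architecture is scaffolding around this one operation, which is roughly what the paper's title is claiming.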

That paper came about because of Machine Learning, which is what we were calling AI before we got it to work. The three ML papers that I consider essential:

  • “Distilling the Knowledge in a Neural Network”, Hinton, Vinyals, and Dean (2015)
  • “Support-Vector Networks”, Corinna Cortes and Vladimir Vapnik (1995)
  • “Random Forests”, Leo Breiman (2001)

https://github.com/papers-we-love/papers-we-love/tree/main/machine_learning

Here are roughly 300 bookmarks about LLMs I've collected dating back to 2022. It's not much more than a bucket, but relative to the whole Web it counts as curation.
https://raindrop.io/katachora/llms-64993031

Here are two bloggers I enjoy who are trying to understand and advance AI without working somewhere with an endless money fountain.
https://timkellogg.me/blog/
https://lethain.com/

Watching

Fundamental Computer Theory Talks

I think it's important to put all this into some sort of perspective, and these three talks make as good a lens as any. They were given by the creators of programming languages (Smalltalk, Erlang, and Clojure, respectively) that are objectively better than what is used for industrialized software manufacturing. Kay's comments about the number of things in the system, Armstrong's comments about the way the system is put together, and Hickey's comments about complexity are each an indictment of how the science in the papers above gets applied.

Alan Kay’s “The Computer Revolution Hasn’t Happened Yet”

Joe Armstrong’s “The Mess We’re In”

Rich Hickey’s “Simple Made Easy”

YouTube Channels that spend a lot of time explaining LLMs/ML/GPT

These are entire channels that mostly explain complicated things in ways that are generally easy to understand, and that have spent a lot of time on AI in 2025 specifically.

Alberta Tech
Welsh Labs
3Blue1Brown

Explainers on Transformers and Generative Pre-trained Transformers (GPT)

These are very good explainers specifically about transformers.

https://youtu.be/SZorAJ4I-sA?si=ZdYwqEePJL67CbMz

https://youtu.be/ZXiruGOCn9s?si=D7XE-zPqSUr4twVL

https://youtu.be/bdICz_sBI34?si=jB5LlNnF1caN5n1r

https://youtu.be/nZrZOI0oRuw?si=piYgev-RqGCfECUf
