📚 Books/Textbooks/Blogs
Books and textbooks are less common here, mainly because I believe building stuff is always going to be a more effective way of learning anything than textbook study, unless your name is phi, of course.
- Understanding Deep Learning - I am currently making my way through this, and plan to do the Jupyter notebooks in my study group.
- Gödel, Escher, Bach - It seems like a lot from this book isn’t really referenced much today, which I find a little strange. Some papers, such as the Platonic Representation Hypothesis, seem to touch on the same ideas of representation and allegory. One reason multi-modal models are so successful is that more data across modalities ends up creating a truer distribution within the model. I have not made my way through all of it (~halfway through), but I found the MIT lecture series on it to be a really good companion. I know it’s cliche, but it did change the way I think about the world, or at least about “AI” systems.
- **The Bitter Lesson -** More data and more compute have historically yielded better results. We see this in scaling laws and we see this in every field. If there’s a benchmark to hit, and data to hit it, then deep learning is eventually going to get there. I have many, many thoughts about how everything everywhere then becomes just a calculation between the amount of available data, the amount of available compute, and the amount of expected compute. Together these determine things like model architecture, the amount of human time spent on a problem, etc. (blogpost link when available July 1, 2024).