<aside>
📢 TLDR: This page collects resources: YouTube channels, research papers, books, blogs, Discord servers, tools, a glossary, friends, and more.
</aside>
📝 Blog
💭 Overall thoughts
Over the last couple of months I’ve been doing some cool stuff. Personally, I think this is the most interesting and important work I’ve ever done. I was talking to a couple of friends in my study group and it dawned on me that I should compile the notes I’ve been taking into a single living document that I can update over time as I build and learn new things.
I’ve been working as a software engineer for ~5 years, but there’s always been something fascinating about machine learning. I don’t know of another field where work at the lowest level has such huge implications for how society could be shaped.
I’ve tried to reference as much material as possible. If I’ve missed any references, made any mistakes, or if you want to get in touch, add me:
Unlike my diet, everything on this page is 100% organic. Anything generated with AI will be indicated 🦾
🛠️ Projects
These are the projects I’ve been working on:
- llm.go: a port of Karpathy’s llm.c. This was a fun but difficult project that took a couple of weeks because the material was pretty unfamiliar. I still need to tidy up the repo and finish a few things (I’m actually presenting this at AI Tinkerers on June 27, 2024), but it was very fulfilling; ever since this project a lot more things make sense, and it’s nice to have a concrete code example to point back to when I’m explaining something or chasing down a question. There’s a small Go sketch of what the port looks like after this list.
- HumanEval-evy: an evy translation of OpenAI’s original HumanEval evaluation set. Evy is a programming language my close friend Julia started, with the goal of being beginner friendly. She also helped translate the 163 code examples from Python into evy (with the very poor help of my very incomplete AST converter), which was instrumental in getting a first working version of an LLM that could translate from Python to evy.
- The above made evy-dataset possible: a dataset of evy code. Cold-starting it was a pain; that’s less of a problem now, but there are still only ~1000 code examples. After this I found out about ‣ from Andrei, which does grammar-constrained generation. I’m yet to try it, but there’s a sketch of the core idea after this list.
- All of this has produced some fine-tuned Gemini models and some fine-tuned GPT-2 models (not very good) that can generate evy code. The final goal is to ship it as part of the evy playground, or even as a code-toolchain feature that does really simple syntax correction. I still need to fold all of this into an evy feature and package it up. Stay tuned for details.
- llama.cpp: I tried to implement Apple’s OpenELM architecture back at the end of April/start of May. I think I got a decent way in, but I ran out of steam, mostly because of my shaky understanding of tensor operations: the reference PyTorch implementation uses row-major tensor storage, and split operations aren’t available in ggml (the library llama.cpp uses under the hood). The tensor sums matched the reference implementation up until the first attention block and then wildly diverged, meaning garbage was output at the end of the 12th layer. The last Go sketch after this list shows the kind of per-layer checksum comparison I used to find where it diverged.
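As a taste of what the llm.go port involves, here’s a minimal Go sketch of a GELU activation kernel in the style of llm.c’s gelu_forward (the tanh approximation GPT-2 uses, applied over flat float32 buffers). The function name and layout are illustrative, not the exact llm.go code.

```go
package main

import (
	"fmt"
	"math"
)

// geluForward applies the tanh approximation of GELU (the one GPT-2 and
// llm.c use): 0.5*x*(1 + tanh(sqrt(2/pi)*(x + 0.044715*x^3))).
// Like llm.c, it works element-wise over flat float32 slices.
func geluForward(out, inp []float32) {
	s := math.Sqrt(2.0 / math.Pi)
	for i, x := range inp {
		xf := float64(x)
		cube := 0.044715 * xf * xf * xf
		out[i] = float32(0.5 * xf * (1.0 + math.Tanh(s*(xf+cube))))
	}
}

func main() {
	inp := []float32{-1, 0, 1, 2}
	out := make([]float32, len(inp))
	geluForward(out, inp)
	fmt.Println(out) // roughly [-0.159 0 0.841 1.955]
}
```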
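Since I mentioned grammar-constrained generation above but haven’t tried it yet, here’s my understanding of the core idea as a minimal Go sketch: before sampling each token, mask out (set to -Inf) the logits of every token the grammar wouldn’t allow next, so the model can only emit valid output. The allowed set here is hard-coded; in a real system a parser over the grammar would compute it from the current prefix.

```go
package main

import (
	"fmt"
	"math"
)

// maskLogits sets the logit of every token the grammar doesn't allow
// to -Inf, so it can never be sampled.
func maskLogits(logits []float64, allowed map[int]bool) {
	for tok := range logits {
		if !allowed[tok] {
			logits[tok] = math.Inf(-1)
		}
	}
}

// argmax is a stand-in for whatever sampling you'd normally do.
func argmax(logits []float64) int {
	best := 0
	for i, v := range logits {
		if v > logits[best] {
			best = i
		}
	}
	return best
}

func main() {
	logits := []float64{2.5, 0.1, 3.7, -1.2} // model scores for a 4-token vocab
	allowed := map[int]bool{0: true, 3: true} // grammar permits only tokens 0 and 3 here
	maskLogits(logits, allowed)
	fmt.Println("chosen token:", argmax(logits)) // 0, not the unconstrained pick 2
}
```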
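And here’s a sketch of how I localized the OpenELM divergence: dump a cheap checksum (the sum of elements) of each layer’s output from both the port and the PyTorch reference, then find the first layer where they disagree. The numbers below are made up for illustration; in practice the reference sums come from the PyTorch model.

```go
package main

import (
	"fmt"
	"math"
)

// firstDivergingLayer returns the index of the first layer whose checksum
// differs from the reference by more than tol, or -1 if they all match.
func firstDivergingLayer(port, ref []float64, tol float64) int {
	for i := range port {
		if math.Abs(port[i]-ref[i]) > tol {
			return i
		}
	}
	return -1
}

func main() {
	refSums := []float64{12.31, -4.02, 7.88, 301.5}   // e.g. dumped from PyTorch
	portSums := []float64{12.31, -4.02, 9999.1, -3.0} // from the ggml port
	if l := firstDivergingLayer(portSums, refSums, 1e-3); l >= 0 {
		fmt.Printf("first divergence at layer %d\n", l)
	}
}
```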
📺 YouTube channels
This is a list of the YouTube channels whose videos have contributed to my current interest in machine learning. Some of them go back years, like CGP Grey, Computerphile, Robert Miles and 3Blue1Brown, whereas the rest I’ve only followed for ~1 year. These are channels I keep coming back to over and over because they’re extremely interesting and don’t feel like a class.