Machine Learning
Artificial Neural Networks
Deep Learning
Large Language Models
- Next word prediction
Resources
Intro to Large Language Models - Andrej Karpathy
https://www.youtube.com/watch?v=zjkBMFhNj_g
Happy New Year: GPT in 500 lines of SQL
https://explainextended.com/2023/12/31/happy-new-year-15/
Quantization
Process of reducing the number of bits used to represent the parameters (weights and biases) of a model/neural network. The primary goal is to shrink the model so that it needs less memory and runs faster, without sacrificing too much accuracy.
Example: FP32 ("full precision"), FP16 ("half precision")
The more bits we use to represent a value, the more precisely it can generally be stored, and the larger the range of values that can be represented. In floating-point formats, the exponent bits determine the range and the mantissa bits determine the precision.
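The effect of fewer bits can be sketched with the Python standard library alone. The first part stores the same value as FP32 and FP16 and compares the rounding error. The second part is a minimal absmax INT8 quantizer; INT8 is not mentioned in the notes above and is used here only as an assumed, common target format. The names `round_trip` and `quantize_absmax_int8` are hypothetical helpers, not part of any library.

```python
import struct


def round_trip(value: float, fmt: str) -> float:
    """Store `value` in the given struct format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def quantize_absmax_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale.

    Assumption: symmetric "absmax" scaling, one of several possible schemes.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


x = 0.1
fp32_error = abs(round_trip(x, "f") - x)  # "f" = 32-bit float
fp16_error = abs(round_trip(x, "e") - x)  # "e" = 16-bit float
print(f"FP32 error: {fp32_error:.3e}, FP16 error: {fp16_error:.3e}")

weights = [0.82, -1.27, 0.03, 0.5]  # made-up example weights
quantized, scale = quantize_absmax_int8(weights)
restored = [q * scale for q in quantized]  # dequantize back to floats
print(quantized, restored)
```

The FP16 error is several orders of magnitude larger than the FP32 error, which is the precision cost that quantization trades for size.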
Resources
A Visual Guide to Quantization
https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-quantization
LoRA (Low-Rank Adaptation)
Efficient fine-tuning method that keeps the original weight matrices frozen and instead creates and trains small low-rank matrices (update matrices) whose product approximates the change to the original weights.
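A minimal sketch of the idea in plain Python, assuming a single weight matrix `W` of shape d x k and an update of the form `W + B @ A`, where `B` is d x r and `A` is r x k. The sizes, the helper names, and the random values are illustrative assumptions; no training loop is shown.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def effective_weights(W, B, A):
    """Frozen weights plus the low-rank update B @ A."""
    return add(W, matmul(B, A))


d, k, r = 64, 64, 4  # hypothetical layer size and rank
random.seed(0)

W = [[random.gauss(0, 1) for _ in range(k)] for _ in range(d)]     # frozen
B = [[0.0] * r for _ in range(d)]                                  # trainable, starts at zero
A = [[random.gauss(0, 0.01) for _ in range(k)] for _ in range(r)]  # trainable

full_params = d * k        # parameters updated by full fine-tuning
lora_params = r * (d + k)  # parameters updated with the low-rank matrices
print(full_params, lora_params)
```

Because `B` starts at zero, the update `B @ A` is zero and the adapted layer initially behaves exactly like the original one; only `A` and `B` would change during training.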
Resources
MIT 6.S191: Introduction to Deep Learning
http://introtodeeplearning.com/
NYU DS-GA 1008: Deep Learning
https://atcold.github.io/pytorch-Deep-Learning/
UC Berkeley Full Stack Deep Learning
UC Berkeley CS182: Designing, Visualizing and Understanding Deep Neural Networks
https://www2.eecs.berkeley.edu/Courses/CS182/
Deep Learning Book by MIT Press
https://www.deeplearningbook.org/
Cornell CS5787: Deep Learning
Introduction to Deep Learning
https://sebastianraschka.com/blog/2021/dl-course.html
Physics-based Deep Learning Book
https://physicsbaseddeeplearning.org/intro.html
Deep Learning with Python
by François Chollet
Practical Deep Learning
A free course designed for people with some coding experience, who want to learn how to apply deep learning and machine learning to practical problems.
Building LLMs from the Ground Up: A 3-hour Coding Workshop
https://www.youtube.com/watch?v=quh7z1q7-uc
Gradient descent
A method of mathematical optimization. An iterative algorithm that finds a local minimum of a loss function by repeatedly taking a small step in the direction opposite to the gradient.
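The update rule can be sketched on a one-dimensional toy problem. The loss function, starting point, learning rate, and step count below are assumptions chosen for illustration; real models apply the same rule to many parameters at once.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x


def loss(x):
    """Toy loss with its minimum at x = 3."""
    return (x - 3.0) ** 2


def loss_grad(x):
    """Derivative of the toy loss."""
    return 2.0 * (x - 3.0)


x_min = gradient_descent(loss_grad, x0=0.0)
print(x_min, loss(x_min))
```

With this convex loss the local minimum is also the global one; for non-convex losses the result depends on the starting point and the learning rate.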
Resources
Neural Networks: Zero to Hero
A course by Andrej Karpathy on building neural networks, from scratch, in code. We start with the basics of backpropagation and build up to modern deep neural networks, like GPT. In my opinion language models are an excellent place to learn deep learning, even if your intention is to eventually go to other areas like computer vision because most of what you learn will be immediately transferable. This is why we dive into and focus on language models. Prerequisites: solid programming (Python), intro-level math (e.g. derivative, gaussian).
https://karpathy.ai/zero-to-hero.html
Tools
Weights & Biases (W&B)
The AI Developer Platform Weights & Biases helps AI developers build better models faster. Quickly track experiments, version and iterate on datasets, evaluate model performance, reproduce models, and manage your ML workflows end-to-end.
Resources
Cornell CS5785: Applied Machine Learning
https://cornelltech.github.io/cs5785-fall-2018/index.html
Hands-On Machine Learning with Scikit-Learn and Tensorflow
by Aurélien Géron