Machine Learning

Artificial Neural Networks

Deep Learning

Large Language Models

Next word prediction
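
As a rough illustration of next word prediction, the sketch below counts which word follows which in a tiny corpus and predicts the most frequent follower. This bigram counter is only a toy stand-in: real LLMs use neural networks over tokens, and the names train_bigram, predict_next, and the sample corpus are made up for this example.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for every word, how often each other word follows it."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus, chosen only for illustration.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)
prediction = predict_next(model, "the")
```

Here "the" is followed by "cat" twice and "mat" once, so the prediction for "the" is "cat". A word that never appears with a follower, such as "slept", yields None.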

Resources

Intro to Large Language Models - Andrej Karpathy

https://www.youtube.com/watch?v=zjkBMFhNj_g

Happy New Year: GPT in 500 lines of SQL

https://explainextended.com/2023/12/31/happy-new-year-15/

Quantization

The process of reducing the number of bits used to represent a model's parameters (weights and biases). The primary goal is to shrink the model so that it needs less memory and runs faster, without sacrificing too much accuracy.

Example: storing weights in FP16 (16-bit, "half precision") instead of FP32 (32-bit, "full precision").

The more bits we use to represent a value, the more precise it generally is. The more bits we have available, the larger the range of values that can be represented.
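
To make the precision trade-off concrete, here is a minimal sketch of one common scheme, symmetric (absmax) quantization to 8-bit integers, written in plain Python. The function names and the sample weights are assumptions made for this example, not taken from any particular library.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using a single scale factor."""
    absmax = max(abs(v) for v in values)
    scale = absmax / 127 if absmax > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the integers and the scale."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.12, -0.5, 0.33, 1.27, -1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding error per weight is at most half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight now fits in 8 bits instead of 32, at the cost of a rounding error of at most scale / 2 per value. The largest absolute weight sets the scale, so a single outlier makes every other weight coarser.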

Resources

A Visual Guide to Quantization

https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-quantization

LoRA (Low-Rank Adaptation)

A parameter-efficient fine-tuning method: the original weight matrices stay frozen, and only a pair of small low-rank matrices (the update matrices), whose product approximates the weight update, is trained.
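
The idea can be sketched in plain Python as follows. The frozen weight matrix W is left untouched, and the output adds a low-rank correction B(Ax) scaled by alpha / rank, following the formulation in the LoRA paper. The dimensions, the initialization values, and the helper names matvec and lora_forward are assumptions made for this example; no training loop is shown.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Hypothetical sizes, chosen only for illustration.
d_out, d_in, rank = 6, 8, 2
rng = random.Random(0)

# Frozen pretrained weights: never updated during fine-tuning.
W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
# Trainable low-rank factors: A starts small and random, B starts at zero,
# so the adapted model initially behaves exactly like the original one.
A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
B = [[0.0] * rank for _ in range(d_out)]


def lora_forward(x, alpha=1.0):
    """Compute W x + (alpha / rank) * B (A x)."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scaling = alpha / rank
    return [b + scaling * u for b, u in zip(base, update)]


x = [1.0] * d_in
output = lora_forward(x)

full_params = d_out * d_in            # parameters in W
lora_params = rank * (d_in + d_out)   # parameters in A and B
```

With these sizes, A and B hold 28 trainable parameters compared with 48 in W. The saving grows quickly for realistic layer sizes, because the count scales with rank * (d_in + d_out) rather than d_in * d_out.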

Resources

MIT 6.S191: Introduction to Deep Learning

http://introtodeeplearning.com/

NYU DS-GA 1008: Deep Learning

https://atcold.github.io/pytorch-Deep-Learning/

UC Berkeley Full Stack Deep Learning

UC Berkeley CS182: Designing, Visualizing and Understanding Deep Neural Networks

https://www2.eecs.berkeley.edu/Courses/CS182/

Deep Learning Book by MIT Press

https://www.deeplearningbook.org/

Cornell CS5787: Deep Learning

Introduction to Deep Learning

https://sebastianraschka.com/blog/2021/dl-course.html

Physics-based Deep Learning Book

https://physicsbaseddeeplearning.org/intro.html

Deep Learning with Python

by François Chollet

Practical Deep Learning

A free course designed for people with some coding experience, who want to learn how to apply deep learning and machine learning to practical problems.

https://course.fast.ai/

Building LLMs from the Ground Up: A 3-hour Coding Workshop

https://www.youtube.com/watch?v=quh7z1q7-uc

Gradient descent

A mathematical optimization method: an iterative algorithm that finds a local minimum of a loss function by repeatedly taking a small step in the direction opposite to the gradient.
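
A minimal sketch of the update rule x = x - learning_rate * gradient(x), applied to the one-dimensional loss (x - 3)^2. The loss function, the learning rate, and the step count are assumptions picked for this example.

```python
def loss(x):
    """Toy loss with its minimum at x = 3."""
    return (x - 3.0) ** 2


def gradient(x):
    """Derivative of the toy loss with respect to x."""
    return 2.0 * (x - 3.0)


def gradient_descent(start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from `start`."""
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x


minimum = gradient_descent(start=0.0)
```

With a learning rate of 0.1, each step shrinks the distance to the minimum by a factor of 0.8, so the result lands very close to 3. For this particular loss, a learning rate above 1.0 would make the iterates diverge instead.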

Resources

Neural Networks: Zero to Hero

A course by Andrej Karpathy on building neural networks, from scratch, in code. We start with the basics of backpropagation and build up to modern deep neural networks, like GPT. In my opinion, language models are an excellent place to learn deep learning, even if your intention is to eventually go to other areas like computer vision, because most of what you learn will be immediately transferable. This is why we dive into and focus on language models. Prerequisites: solid programming (Python), intro-level math (e.g. derivative, Gaussian).

https://karpathy.ai/zero-to-hero.html

Tools

Weights & Biases (W&B)

https://wandb.ai/site

Weights & Biases is an AI developer platform that helps AI developers build better models faster: quickly track experiments, version and iterate on datasets, evaluate model performance, reproduce models, and manage ML workflows end-to-end.

Resources

Cornell CS5785: Applied Machine Learning

https://cornelltech.github.io/cs5785-fall-2018/index.html

Hands-On Machine Learning with Scikit-Learn and TensorFlow

by Aurélien Géron

Machine Learning Engineering Open Book

https://github.com/stas00/ml-engineering