YOLO Series

You Only Look Once: study notes on the evolution of a real-time object detection architecture.

YOLO is a family of single-stage object detection models that predict bounding boxes and class probabilities in a single forward pass over the image, which makes them considerably faster than two-stage detectors.

Evolution Timeline

Version	Year	Key Innovation	Study Note
YOLOv1	2016	First single-stage detector	📄 Note
YOLOv2	2017	Batch normalization, anchor boxes	📄 Note
YOLOv3	2018	Feature pyramid networks	📄 Note
YOLOv4	2020	CSPDarknet, mosaic augmentation	📄 Note
YOLOv5	2020	PyTorch implementation	📄 Note
YOLOv6	2022	Industrial-focused optimizations	📄 Note
YOLOv7	2022	E-ELAN architecture	📄 Note
YOLOv8	2023	Unified framework	📄 Note
YOLOv9	2024	Programmable Gradient Information	📄 Note

Foundation

Before diving into YOLO, review the CNN Fundamentals notes to understand convolutional layers, pooling, and activation functions.

Key Concepts

Single-Stage vs. Two-Stage

Unlike the R-CNN series, which first proposes candidate regions and then classifies them (two-stage), YOLO frames object detection as a single regression problem that maps image pixels directly to bounding box coordinates and class probabilities.
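To make the "single regression" framing concrete, here is a minimal sketch of the size of the tensor the network regresses. It assumes the YOLOv1 layout, which this page does not spell out: each of the $S \times S$ cells predicts `B` boxes with five values each (x, y, w, h, confidence) plus `C` class probabilities. The function name and the defaults (`S=7`, `B=2`, `C=20`, the PASCAL VOC setting) are illustrative assumptions. Later versions use different output layouts.

```python
def yolo_v1_output_shape(S=7, B=2, C=20):
    """Shape of the prediction tensor under the assumed YOLOv1 layout.

    Each grid cell holds B boxes * 5 values (x, y, w, h, confidence)
    plus C class probabilities shared by the whole cell.
    """
    values_per_cell = B * 5 + C
    return (S, S, values_per_cell)


shape = yolo_v1_output_shape()
total_values = shape[0] * shape[1] * shape[2]
print(shape, total_values)
```

With these defaults the whole detection result is one fixed-size tensor produced by a single forward pass, with no separate region-proposal step.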

Grid Cell Strategy

YOLO divides the input image into an $S \times S$ grid. If the center of an object falls into a grid cell, that grid cell is responsible for detecting that object.
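The assignment rule above can be sketched as follows. This is an illustration, not code from any YOLO release: the function name `responsible_cell`, its arguments, and the default `S=7` are assumptions, and the box center is assumed to be given in pixel coordinates.

```python
def responsible_cell(cx, cy, img_w, img_h, S=7):
    """Return (row, col, x_offset, y_offset) for a box center (cx, cy).

    row/col index the grid cell that contains the center; the offsets
    give the center's position inside that cell, in units of one cell.
    """
    # Scale the center from pixel coordinates to grid coordinates.
    gx = cx / img_w * S
    gy = cy / img_h * S
    # Clamp so a center on the right or bottom edge stays inside the grid.
    col = min(int(gx), S - 1)
    row = min(int(gy), S - 1)
    return row, col, gx - col, gy - row


# A box centered at (300, 100) in a 448x448 image:
row, col, x_off, y_off = responsible_cell(300, 100, 448, 448)
print(row, col, x_off, y_off)
```

Here the center falls in row 1, column 4 of the 7x7 grid, so under this rule only that cell is responsible for predicting the object.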