YOLOv8

Self-host YOLOv8, a state-of-the-art real-time object detection and image segmentation model for computer vision applications

Alternative To

• Google Cloud Vision
• AWS Rekognition
• Azure Computer Vision

Difficulty Level

Intermediate

Requires some technical experience. Moderate setup complexity.

Overview

YOLOv8 is a real-time object detection and image segmentation model developed by Ultralytics and released in January 2023. It builds on earlier YOLO versions with improvements in both accuracy and speed. YOLOv8 supports object detection, instance segmentation, pose estimation, and image classification, and the Ultralytics package adds object tracking on top of these models. Pretrained weights come in five sizes (n, s, m, l, x) that trade speed for accuracy.

System Requirements

CPU: 4+ cores
RAM: 8GB+
GPU:
- Optional for inference but highly recommended
- NVIDIA GPU with 4GB+ VRAM (8GB+ recommended for larger models)
- CUDA support
Storage: 5GB+ for the framework and models

Installation Guide

Prerequisites

Python 3.8 or higher
pip package manager
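NVIDIA drivers and a CUDA-enabled PyTorch build if using GPU acceleration (the PyTorch pip wheels for Linux and Windows bundle the CUDA runtime, so a separate CUDA toolkit install is usually not required)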
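
As a quick way to check the prerequisites above, the following sketch reports the Python version and, if PyTorch happens to be installed already, whether it can see a CUDA device. The function name check_environment is illustrative and not part of Ultralytics; torch.cuda.is_available() is the standard PyTorch call.

```python
import sys


def check_environment(min_python=(3, 8)):
    """Return a small report about the local Python and (optional) CUDA setup."""
    report = {
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        "python_ok": sys.version_info[:2] >= min_python,
        "torch_installed": False,
        "cuda_available": False,
    }
    try:
        import torch  # only present once PyTorch (or ultralytics) is installed
    except ImportError:
        return report
    report["torch_installed"] = True
    report["cuda_available"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If cuda_available is False on a machine with an NVIDIA GPU, the usual suspects are an outdated driver or a CPU-only PyTorch build.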

Standard Installation

The easiest way to install YOLOv8 is using pip:

pip install ultralytics

This installs the Ultralytics package, which includes YOLOv8 along with its dependencies such as PyTorch and OpenCV.

Verification

Verify the installation by running a simple detection. On first use, the yolov8n.pt weights are downloaded automatically:

# Python
python -c "from ultralytics import YOLO; YOLO('yolov8n.pt')('https://ultralytics.com/images/bus.jpg')"

# CLI
yolo predict model=yolov8n.pt source='https://ultralytics.com/images/bus.jpg'

Practical Exercise: Object Detection with YOLOv8

Let’s create a complete object detection pipeline using YOLOv8.

Step 1: Create a Basic Detection Script

Create a file named detect.py with the following code:

from ultralytics import YOLO
import cv2

# Load a pretrained YOLOv8 model
model = YOLO('yolov8n.pt')  # 'n' for nano size (also 's', 'm', 'l', 'x' available)

# Source can be an image, video, or directory containing images/videos
source = 'path/to/your/image_or_video.jpg'  # Replace with your file

# Run inference
results = model(source)

# Process results (one Results object per image or video frame)
for index, result in enumerate(results):
    # Copy the original image (BGR numpy array) so drawing does not modify the result
    image = result.orig_img.copy()

    # Get detection boxes, confidence scores, and class IDs
    boxes = result.boxes.xyxy.cpu().numpy()
    confidences = result.boxes.conf.cpu().numpy()
    class_ids = result.boxes.cls.cpu().numpy().astype(int)

    # Draw the detections on the image
    for box, confidence, class_id in zip(boxes, confidences, class_ids):
        # Convert to plain Python ints, which OpenCV drawing functions expect
        x1, y1, x2, y2 = map(int, box)
        class_name = model.names[int(class_id)]

        # Draw bounding box
        cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)

        # Add label, keeping the text inside the image when the box touches the top edge
        label = f"{class_name}: {confidence:.2f}"
        cv2.putText(image, label, (x1, max(y1 - 10, 15)),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    # Save each result under its own name so multi-image or video sources
    # do not overwrite a single file
    cv2.imwrite(f'result_{index}.jpg', image)

    # Display the result only for a single image (requires a desktop session)
    if len(results) == 1:
        cv2.imshow("YOLOv8 Detection", image)
        cv2.waitKey(0)
        cv2.destroyAllWindows()
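
The script above draws every detection the model returns. A common next step is to keep only confident detections, or only certain classes, before drawing or counting them. The sketch below shows one way to do that in plain Python; filter_detections and count_by_class are hypothetical helper names, and the sample values are made up to stand in for the boxes, confidences, and class_ids arrays from the script.

```python
from collections import Counter


def filter_detections(boxes, confidences, class_ids, names,
                      min_conf=0.5, keep_classes=None):
    """Keep detections above min_conf, optionally restricted to given class names."""
    kept = []
    for box, conf, class_id in zip(boxes, confidences, class_ids):
        name = names[int(class_id)]
        if conf < min_conf:
            continue
        if keep_classes is not None and name not in keep_classes:
            continue
        kept.append({
            "box": tuple(int(v) for v in box),
            "confidence": float(conf),
            "class_name": name,
        })
    return kept


def count_by_class(detections):
    """Count how many kept detections belong to each class name."""
    return Counter(d["class_name"] for d in detections)


# Made-up values standing in for model.names and the result.boxes arrays
names = {0: "person", 5: "bus"}
boxes = [(10, 20, 110, 220), (50, 60, 400, 300), (5, 5, 30, 40)]
confidences = [0.91, 0.78, 0.32]
class_ids = [0, 5, 0]

detections = filter_detections(boxes, confidences, class_ids, names, min_conf=0.5)
print(count_by_class(detections))
```

Ultralytics can also filter at prediction time through the conf and classes arguments, for example model(source, conf=0.5, classes=[0, 5]); the helper above is useful when you want to apply different filters to the same set of results.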

Step 2: Real-time Webcam Detection

Create a script for real-time detection from a webcam:

from ultralytics import YOLO
import cv2

# Load the model
model = YOLO('yolov8n.pt')

# Open the webcam
cap = cv2.VideoCapture(0)

# Set the resolution
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

while cap.isOpened():
    # Read a frame from the webcam
    success, frame = cap.read()

    if success:
        # Run YOLOv8 inference on the frame
        results = model(frame)

        # Visualize the results on the frame
        annotated_frame = results[0].plot()

        # Display the annotated frame
        cv2.imshow("YOLOv8 Webcam", annotated_frame)

        # Break the loop if 'q' is pressed
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    else:
        # Break the loop if the webcam is disconnected
        break

# Release the webcam and close the window
cap.release()
cv2.destroyAllWindows()
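
To judge whether the webcam loop above actually runs in real time on your hardware, it helps to measure frames per second. The following is a minimal sketch of a rolling FPS counter using only the standard library; the FPSCounter class is an illustrative helper, not something Ultralytics provides.

```python
import time
from collections import deque


class FPSCounter:
    """Rolling frames-per-second estimate over the last `window` frames."""

    def __init__(self, window=30):
        self.timestamps = deque(maxlen=window)

    def tick(self, now=None):
        """Record one frame and return the current FPS estimate."""
        self.timestamps.append(time.perf_counter() if now is None else now)
        if len(self.timestamps) < 2:
            return 0.0
        elapsed = self.timestamps[-1] - self.timestamps[0]
        if elapsed <= 0:
            return 0.0
        return (len(self.timestamps) - 1) / elapsed


# Usage inside the webcam loop (after annotated_frame is created):
#     fps = counter.tick()
#     cv2.putText(annotated_frame, f"FPS: {fps:.1f}", (10, 30),
#                 cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
counter = FPSCounter(window=30)
```

Averaging over a window smooths out per-frame jitter; a smaller window reacts faster to changes but gives a noisier reading.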

Step 3: Training on Custom Data

To train YOLOv8 on your own dataset:

Prepare your dataset in YOLO format:
- A root folder containing train and val (and optionally test) subdirectories
- An images folder and a labels folder inside each subdirectory
- One .txt label file per image, with the same base name as the image and one line per object: class_id x_center y_center width height, separated by spaces, with coordinates normalized to the 0-1 range relative to the image size
- A YAML file describing your dataset structure and classes

Create a YAML file (e.g., custom_dataset.yaml) with the following content:

# Dataset configuration
path: /path/to/dataset # dataset root directory
train: train/images # train images relative to 'path'
val: val/images # val images relative to 'path'
test: # test images (optional)

# Class names
names:
  0: class1
  1: class2
  # Add more classes as needed
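
If your annotations are stored as pixel-space corner boxes, they need converting to the normalized YOLO label format before training. The sketch below shows that conversion, plus a small check that lists images without a matching label file in the folder layout described above. Both function names are hypothetical, and the list of image suffixes is an assumption you may need to extend.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}  # assumed set, extend as needed


def to_yolo_line(class_id, x1, y1, x2, y2, img_w, img_h):
    """Convert a pixel-space corner box to one normalized YOLO label line."""
    x_center = (x1 + x2) / 2 / img_w
    y_center = (y1 + y2) / 2 / img_h
    width = (x2 - x1) / img_w
    height = (y2 - y1) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def images_missing_labels(split_dir):
    """List images in <split>/images that have no <split>/labels/<name>.txt."""
    split_dir = Path(split_dir)
    missing = []
    for image_path in sorted((split_dir / "images").iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        label_path = split_dir / "labels" / (image_path.stem + ".txt")
        if not label_path.exists():
            missing.append(image_path.name)
    return missing


# A 200x200 pixel box in a 640x480 image, class 0
print(to_yolo_line(0, 100, 200, 300, 400, img_w=640, img_h=480))
# → 0 0.312500 0.625000 0.312500 0.416667
```

An image without a label file is treated as a background image during training, so a missing file is not always a mistake, but an unexpectedly long list usually points to a naming or path problem.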

Start training:

from ultralytics import YOLO

# Load a pretrained YOLOv8 model
model = YOLO('yolov8n.pt')

# Train the model using your custom dataset
results = model.train(
    data='custom_dataset.yaml',
    epochs=100,
    imgsz=640,
    batch=16,
    name='custom_model'
)

# Evaluate the trained model on the validation split
metrics = model.val()

# Export the model to ONNX format for deployment
model.export(format='onnx')

Step 4: Model Deployment

After training, deploy your model:

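- Find your trained weights at runs/detect/custom_model/weights/best.pt by default (the path is relative to the directory where you started training; if a custom_model run already exists, Ultralytics appends a number, e.g. custom_model2)
- Load them for inference just like the pretrained models, e.g. YOLO('runs/detect/custom_model/weights/best.pt')
- Export to formats such as ONNX, TensorRT, or CoreML for deployment on various platforms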
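
Because repeated training runs end up in separate folders, it can be convenient to locate the newest weights file programmatically before loading it. The following is a minimal sketch, assuming the default runs/detect output location and the custom_model run name used in this guide; find_latest_weights and predict_with_trained_model are hypothetical helper names.

```python
from pathlib import Path


def find_latest_weights(runs_dir="runs/detect", run_name="custom_model",
                        filename="best.pt"):
    """Return the most recently modified weights file among matching run folders."""
    candidates = sorted(
        Path(runs_dir).glob(f"{run_name}*/weights/{filename}"),
        key=lambda p: p.stat().st_mtime,
    )
    return candidates[-1] if candidates else None


def predict_with_trained_model(source, weights_path):
    """Load trained weights and run inference (requires the ultralytics package)."""
    from ultralytics import YOLO  # imported lazily so the helper above works without it

    model = YOLO(str(weights_path))
    return model(source)


if __name__ == "__main__":
    weights = find_latest_weights()
    if weights is None:
        print("No trained weights found; run training first.")
    else:
        print(f"Using {weights}")
```

Note that the glob pattern matches any folder whose name starts with the run name, so pick a distinctive name if several experiments share a prefix.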

Resources

Official Documentation

Comprehensive documentation for all YOLOv8 functionality:

Ultralytics Docs

GitHub Repository

Source code and examples:

Ultralytics GitHub

Community Support

Get help and share experiences:

GitHub Issues

Tutorials and Guides

Learn more with tutorials:

Ultralytics Docs Tutorials

Roboflow YOLOv8 Tutorials

Model Zoo

Explore pretrained models:

YOLOv8 Models
