[Learn YOLO with Me] YOLO26 (4): Training an ROP Classification Model

YouCans · Published 2026-06-13 08:21:16

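Welcome to the "Learn YOLO with Me" series:
[Learn YOLO with Me] YOLO26 (1): The new end-to-end vision AI breakthrough released at YOLO Vision 2025
[Learn YOLO with Me] YOLO26 (2): Key architecture improvements and performance benchmarks for real-time object detection
[Learn YOLO with Me] YOLO26 (3): Model download, environment setup, and object detection
[Learn YOLO with Me] YOLO26 (4): Training an ROP classification model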

YOLO26 (4): Classification Model Training

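Ultralytics has released YOLO26, the latest generation of its YOLO model family. This article walks through training a YOLO26 classification model on a custom dataset. YOLO26 supports a range of computer vision tasks:

Detection (detect): identifies and localizes objects in images or video;
Segmentation (segment): instance segmentation partitions an image or video into regions that correspond to different objects or classes;
Classification (classify): predicts the class label of an input image;
Pose estimation (pose): detects objects in images or video and estimates their keypoints;
Oriented bounding boxes (obb): detects objects with rotated bounding boxes, which suits satellite or medical imagery.

The previous installment covered YOLO26 model download, environment setup, and object detection. This one trains a YOLO26 classification model on your own dataset to build a private, task-specific model.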

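1. YOLO26 Environment Setup and Installation

This article only summarizes how to create and configure the YOLO26 environment. For the detailed steps, see [Learn YOLO with Me] YOLO26 (3): Model download, environment setup, and object detection.

Install Anaconda and PyCharm
Create a Python environment
Create a Python environment named YOLO26 (Python 3.8 is the recommended version here), then activate it.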

conda env list
conda create -n YOLO26 python=3.8
conda activate YOLO26

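Install PyTorch (example)
Install the GPU build of PyTorch first (see: PyTorch GPU installation and environment setup), then install ultralytics.
An example follows. Choose the versions that match your hardware (see https://pytorch.org/ for the available builds):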

conda activate YOLO26
pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu121

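Install Ultralytics
Installing YOLO26 with pip is recommended. pip reads the dependency list that the ultralytics package declares on PyPI (install_requires) and recursively installs every required third-party library.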

conda activate YOLO26
pip install -U ultralytics

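Download pretrained weights
YOLO26 comes in several sizes, from smallest to largest: YOLO26n, YOLO26s, YOLO26m, YOLO26l, YOLO26x. All of them support inference, validation, training, and export, so they can be used at every stage of development and deployment.

If no pretrained model is found locally at run time, YOLO26 downloads it automatically from the latest Ultralytics release. Depending on network conditions and restrictions, that download can be slow or fail outright, so downloading the pretrained weights in advance is recommended.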

wget https://github.com/ultralytics/assets/releases/download/v8.4.0/yolo26m-cls.pt  # (example)

Verify the installation:
Run "yolo help". If the YOLO26 help text is printed, the installation succeeded.


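2. Preparing the ROP Classification Dataset

The YOLO26 project ships guides and examples for converting datasets under ".\docs\en\datasets". For example, coco.md describes training a YOLO26 object detection model on the COCO dataset.

2.1 YOLO Dataset Format

1. Dataset layout
A YOLO detection dataset contains image files and annotation files. Images are typically jpg or png files that contain the objects to detect. Annotation files are text files that hold the class and position of every object in the corresponding image.

YOLO detection models are trained on COCO2017 by default, organized as follows.

The images directory contains train and val folders with the image files used for training and validation;
The labels directory contains train and val folders with the annotation files for those images;
The images directory may also contain a test folder with the images used for testing.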

dataset/
├── images/
│   ├── train/
│   ├── val/
│   └── test/  # (optional)
└── labels/
    ├── train/
    └── val/

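2. Annotation file format

A YOLO annotation file is a plain text file in which each line describes one object.
Each line holds the class ID, the x coordinate of the box center (relative to image width), the y coordinate of the box center (relative to image height), the box width (relative to image width), and the box height (relative to image height):

<object-class-id> <x> <y> <width> <height>

For example, in the line "2 0.3 0.6 0.25 0.40", "2" is the class ID and the four numbers that follow are the box center <x> <y> and the box size <width> <height>, all normalized to the range 0 to 1.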
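To make the normalization concrete, here is a minimal sketch that converts one label line back to pixel coordinates. The function name and the 640x480 image size are illustrative assumptions, not part of any YOLO API.

```python
def yolo_to_pixel_box(line, img_w, img_h):
    """Convert one YOLO label line to (class_id, x_min, y_min, x_max, y_max) in pixels."""
    class_id, x, y, w, h = line.split()
    # Center and size are normalized, so scale them by the image dimensions.
    cx, cy = float(x) * img_w, float(y) * img_h
    bw, bh = float(w) * img_w, float(h) * img_h
    return int(class_id), cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2


# Hypothetical 640x480 image, using the example line from the text.
print(yolo_to_pixel_box("2 0.3 0.6 0.25 0.40", img_w=640, img_h=480))
```

The same arithmetic run in reverse (pixel corners to normalized center and size) is what a format converter has to do.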

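3. Converting XML annotations

Pascal VOC datasets usually store annotations as XML. Every XML annotation in the dataset folder must be converted to YOLO format, and the converted label files must be saved under the labels folder.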
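The conversion itself is short. The sketch below assumes the standard Pascal VOC tags (size/width, size/height, object/name, bndbox/xmin and so on). The sample XML and the class list are made up for illustration, and writing the resulting lines to files under labels is left out.

```python
import xml.etree.ElementTree as ET


def voc_xml_to_yolo(xml_text, class_names):
    """Convert one Pascal VOC annotation (XML string) to YOLO label lines."""
    root = ET.fromstring(xml_text)
    img_w = float(root.findtext("size/width"))
    img_h = float(root.findtext("size/height"))
    lines = []
    for obj in root.iter("object"):
        class_id = class_names.index(obj.findtext("name"))
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (float(box.findtext(k)) for k in ("xmin", "ymin", "xmax", "ymax"))
        # VOC stores pixel corners; YOLO stores the normalized center and size.
        x, y = (xmin + xmax) / 2 / img_w, (ymin + ymax) / 2 / img_h
        w, h = (xmax - xmin) / img_w, (ymax - ymin) / img_h
        lines.append(f"{class_id} {x:.6f} {y:.6f} {w:.6f} {h:.6f}")
    return lines


# Made-up annotation for illustration only.
SAMPLE_XML = """<annotation>
  <size><width>640</width><height>480</height></size>
  <object><name>fish</name>
    <bndbox><xmin>112</xmin><ymin>192</ymin><xmax>272</xmax><ymax>384</ymax></bndbox>
  </object>
</annotation>"""

print(voc_xml_to_yolo(SAMPLE_XML, class_names=["fish", "shark"]))
```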

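2.2 Downloading a YOLO Dataset

Open the Roboflow website, pick one of the public datasets, and download it locally.

The downloaded Aquarium dataset has three folders, test, train, and valid, used for testing, training, and validation respectively. Each contains an images folder and a labels folder that hold the image files and the annotation files.

A typical annotation file looks like this. It has 4 lines, one per object. Each line has 5 values: the first is the class label and the remaining four are the bounding box center and size.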

3 0.5 0.5361328125 0.08854166666666667 0.1142578125
3 0.30859375 0.3115234375 0.09244791666666667 0.103515625
3 0.71875 0.5859375 0.15104166666666666 0.0888671875
3 0.3072916666666667 0.494140625 0.10807291666666667 0.0693359375

A reference Python script for training on the Aquarium dataset follows.

from ultralytics import YOLO

if __name__ == '__main__':
    # Build a YOLOv13 model from the given model configuration
    model = YOLO('yolov13.yaml')

    # Load the pretrained YOLOv13 weights
    model.load('yolov13n.pt')

    # Train the model on the given dataset
    results = model.train(data=r'dataAquariumYolo.yaml',  # path to the dataset configuration file
                          cache=False,  # whether to cache the dataset to speed up later epochs
                          workers=4,  # number of dataloader workers
                          device='0',  # device to run on (GPU index or 'cpu')
                          epochs=100,  # total number of training epochs
                          batch=64,  # batch size
                          imgsz=640,  # training image size
                          scale=0.5,  # S:0.9; L:0.9; X:0.9
                          mixup=0.0,  # S:0.05; L:0.15; X:0.2
                          freeze=10,  # freeze the first 10 layers (backbone)
                          optimizer='SGD',  # use the SGD optimizer (stochastic gradient descent)
                          )

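Notes:
(1) The YOLOv13 project used in this example is located at "C:\Python\Projects\YOLOv13".
(2) The dataset configuration file used in this example is "C:\Python\Projects\YOLOv13\dataAquariumYolo.yaml".
(3) The trained model and the training logs are saved under "C:\Python\Projects\YOLOv13\runs\detect\train".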


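3.4 Training Logs

The charts in the training log are essential for evaluating and understanding model performance, and they help identify where the model is strong and where it falls short.

Training results are saved in runs\detect\train. The training log charts are shown in the figure below.

[Figure: training log charts]

After training, the best model is saved as "runs\detect\train\weights\best.pt".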

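- weights folder
    - best.pt: the checkpoint with the best validation metrics
    - last.pt: the checkpoint from the final epoch
- args.yaml: the configuration used for the training run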

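Confusion matrix

confusion_matrix.png shows how well the model separates the classes. Each row corresponds to a predicted class and each column to an actual class. The diagonal entries count correct predictions, and a darker diagonal cell means more correct predictions for that class.
confusion_matrix_normalized.png: the normalized confusion matrix, showing the fraction of correct predictions for each class.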
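As an illustration of the normalization step, the sketch below divides every column by its sum, following the layout described above (rows are predictions, columns are actual classes). The counts are made up and the helper name is hypothetical.

```python
def normalize_by_actual(matrix):
    """Divide each column by its sum so that every actual class sums to 1."""
    n = len(matrix)
    col_sums = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    return [
        [matrix[r][c] / col_sums[c] if col_sums[c] else 0.0 for c in range(n)]
        for r in range(n)
    ]


# Made-up counts for two classes: rows = predicted, columns = actual.
counts = [[8, 1],
          [2, 9]]
normalized = normalize_by_actual(counts)
for row in normalized:
    print(row)
```

After normalization, each diagonal entry is the fraction of that actual class that was predicted correctly.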

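F1-confidence curve

F1_curve.png: the F1-confidence curve, showing how the F1 score changes with the confidence threshold.
The F1 score is the harmonic mean of precision and recall. The peak of the curve marks the confidence threshold at which precision and recall are best balanced.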
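A short worked example of the harmonic mean, using the precision (0.724) and recall (0.674) from the "all" row of the validation log in section 4.1 of this article. The function name is an illustrative choice.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall from the "all" row of the validation log.
print(round(f1_score(0.724, 0.674), 3))
```

Because it is a harmonic mean, the F1 score is pulled toward the lower of the two values, so a model cannot reach a high F1 by maximizing only precision or only recall.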

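Label distribution and label correlogram

labels.jpg: the label distribution and the bounding box distribution.
The bar chart shows the number of instances per class. The scatter plots show how bounding boxes are distributed spatially in the detection task, reflecting common sizes and aspect ratios.
labels_correlogram.jpg: the label correlogram.
The correlogram shows how the labels relate to one another and how their positions in the image correlate, which helps explain associations or confusions the model may run into when telling classes apart.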

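P/PR/R curves

P_curve.png: the precision-confidence curve, showing how precision changes with the confidence threshold.
Precision is the ratio of correctly predicted positives to all predicted positives.
PR_curve.png: the precision-recall curve, showing the trade-off between precision and recall.
Ideally the model keeps a good balance between precision and recall.
R_curve.png: the recall-confidence curve, showing how recall changes with the confidence threshold.
Recall is the ratio of correctly predicted positives to all actual positives.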

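Training result charts and data

results.png and results.csv: training result charts and the underlying data.
They show how the model evolves during training, including the loss curves and the evaluation metrics (such as precision, recall, and mAP).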

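3.5 Resuming Training

YOLOv13 provides the "resume" argument for resuming an interrupted run.

Training YOLOv13 on a large dataset takes a long time. If training is interrupted or fails, it can continue from the checkpoint saved before the interruption.

Setting "resume" to True loads the model weights and optimizer state from the previous run and continues training from where it stopped.
When loading the weights, use the checkpoint from the interrupted run, that is, the last saved weights (last.pt).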

from ultralytics import YOLO

if __name__ == '__main__':
    # Load the last checkpoint of the interrupted run (weights, optimizer state, epoch count).
    # The path assumes the default output directory used earlier in this article.
    model = YOLO(r'runs\detect\train\weights\last.pt')

    # Resume training; resume must be the boolean True, not the string 'True'
    results = model.train(resume=True)

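4. Model Validation and Prediction

4.1 Model Validation

Validate the model after training.

Copy the trained model best.pt to the project root and rename it "yolo13n_Aquarium.pt".
The validation script is as follows.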

from ultralytics import YOLO

if __name__ == '__main__':
    # Load the trained model
    model = YOLO('yolo13n_Aquarium.pt')
    # Validate the model
    metrics = model.val()  # no arguments needed; the dataset and settings stored in the checkpoint are reused

Run the validation script. The results are saved in the ".\runs\detect\val" folder.

C:\Users\Administrator\.conda\envs\YOLO13\python.exe C:\Python\Projects\YOLOv13\Yolo13Aquarium_Val.py 
Ultralytics 8.3.63 🚀 Python-3.11.13 torch-2.3.1+cu121 CUDA:0 (NVIDIA GeForce RTX 3060, 12288MiB)
YOLOv13 summary: 535 layers, 2,449,260 parameters, 0 gradients, 6.2 GFLOPs
val: Scanning C:\Python\Projects\DatasetAquariumYolo\labels\valid.cache... 127 images, 0 backgrounds, 0 corrupt: 100%|                 

Class     Images  Instances      Box(P          R      mAP50      mAP75  mAP50-95): 100%|██████████| 8/8 [00:01<00:00,  4.57it/s]
                   all        127        909      0.724      0.674      0.718      0.431      0.436
                  fish         63        459      0.793      0.747      0.794      0.397      0.428
             jellyfish          9        155       0.77      0.923      0.916      0.502      0.518
               penguin         17        104      0.554      0.673      0.625      0.244      0.312
                puffin         15         74      0.602      0.405      0.478      0.212      0.224
                 shark         28         57      0.731      0.632      0.673      0.497      0.458
              starfish         17         27      0.854       0.65      0.757      0.502      0.521
              stingray         23         33      0.764      0.687      0.783      0.662      0.594
Speed: 0.9ms preprocess, 7.9ms inference, 0.0ms loss, 1.0ms postprocess per image
Results saved to runs\detect\val

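4.2 Model Prediction

Run prediction after training.

Copy the trained model best.pt to the project root and rename it "yolo13n_Aquarium.pt".
The prediction script is as follows.
The source argument can be one or more image files, a video file, a folder, or a video capture device.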

from ultralytics import YOLO

if __name__ == '__main__':
    # Load the trained model
    model = YOLO('yolo13n_Aquarium.pt')
    # Run prediction on every image in the test folder and save the annotated results
    outputs = model.predict(source=r"C:\Python\Projects\DatasetAquariumYolo\images\test", save=True)

Run the prediction script. The results are saved in the ".\runs\detect\predict" folder.

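3. YOLO26 Quick Start

3.1 Running from the CLI

The command line interface (CLI) is a direct way to use Ultralytics YOLO models: tasks run straight from the terminal, with no Python code to write.

YOLO26 can train, validate, or run inference on models for every task and version from the command line, without customization or code.

Basic syntax:
The basic syntax of the yolo command is:

yolo TASK MODE ARGS

Where:
(1) MODE: the mode, one of [train, val, predict, export, track, benchmark]. Required.
(2) TASK: the vision task, one of [detect, segment, classify, pose, obb]. Optional; when omitted, TASK is inferred automatically.
(3) ARGS: custom arguments. Optional. Each argument must be given as arg=value and overrides the default. Do not use the -- prefix. Separate multiple arguments with spaces, not commas.
For the available ARGS, see the configuration page and default.yaml.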
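The syntax rules above can be captured in a few lines. The helper below is a hypothetical illustration, not part of Ultralytics: it only assembles a command string with the optional TASK before the MODE and space-separated arg=value pairs.

```python
def build_yolo_command(mode, task=None, **args):
    """Assemble a 'yolo TASK MODE ARGS' command string."""
    parts = ["yolo"]
    if task:
        parts.append(task)  # TASK is optional
    parts.append(mode)      # MODE is required
    # Arguments use arg=value, no "--" prefix, separated by spaces.
    parts += [f"{key}={value}" for key, value in args.items()]
    return " ".join(parts)


print(build_yolo_command("train", task="detect", data="coco8.yaml",
                         model="yolo26n.pt", epochs=100, imgsz=640))
```

The resulting string can be pasted into a terminal in which the YOLO26 environment is active.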

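Modes (mode):
Ultralytics YOLO models run in different modes. The default is "train".
(1) Train (train): train a YOLO model on a custom dataset.
(2) Validate (val): validate a trained YOLO model.
(3) Predict (predict): run a trained YOLO model on new images or videos.
(4) Export (export): export a YOLO model for deployment.
(5) Track (track): track objects in real time with a YOLO model.
(6) Benchmark (benchmark): benchmark the speed and accuracy of YOLO exports (ONNX, TensorRT, and so on).

Vision tasks (task):
(1) Detection (detect): identifies and localizes objects in images or video;
(2) Segmentation (segment): instance segmentation partitions an image or video into regions that correspond to different objects or classes;
(3) Classification (classify): predicts the class label of an input image;
(4) Pose estimation (pose): detects objects in images or video and estimates their keypoints;
(5) Oriented bounding boxes (obb): detects objects with rotated bounding boxes, which suits satellite or medical imagery.

Usage: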

# 1. Train: train a YOLO26n model on the COCO8 dataset for 100 epochs at image size 640
yolo detect train data=coco8.yaml model=yolo26n.pt epochs=100 imgsz=640

# 2. Predict: run a pretrained segmentation model on a YouTube video at image size 320
yolo predict model=yolo26n-seg.pt source='https://youtu.be/LNwODJXcvt4' imgsz=320

# 3. Validate: validate a pretrained detection model with batch size 1 at image size 640
yolo val model=yolo26n.pt data=coco8.yaml batch=1 imgsz=640

# 4. Export: export a YOLO classification model to ONNX with image size 224x128
yolo export model=yolo26n-cls.pt format=onnx imgsz=224,128

# 5. Special commands
yolo help  # show the available commands, arguments, and usage examples
yolo checks  # check the YOLO runtime environment
yolo version  # show the installed version
yolo settings  # manage the global settings
yolo copy-cfg  # copy the default configuration file to the current directory
yolo cfg  # print the default configuration

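Example:
The yolo command can be run for object detection either from the Miniconda Prompt or from the terminal window in PyCharm. The steps are:

(1) Open the Miniconda Prompt, activate the YOLO26 environment, and run the following commands to detect objects in the given image.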

conda activate YOLO26
cd c:\Python\Projects2025\Yolo26
yolo predict model=yolo26n.pt source="./data/images/bus.jpg"

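3.2 Running from Python

YOLO26 also provides a Python interface with functions for loading and running models and for handling their output. The interface is simple and easy to use, so object detection, segmentation, and classification can be added to a Python project quickly.

A Python example that runs inference with the pretrained model yolo26n.pt follows.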

from ultralytics import YOLO

# Load the pretrained YOLO26 model
model = YOLO("yolo26n.pt")

# Run object detection on an image and save the annotated result
result = model(source="./data/images/roadflow.png", save=True)

# Export the model to ONNX format
success = model.export(format="onnx")

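Running the script performs detection on the given image file and saves the result to the "./runs/detect/predict" folder.

4.1 ROP Classification Model Training (Original Images)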

C:\Python\Miniconda3\envs\YOLO26\python.exe C:\Python\PyProjects2026\YOLO26_ROP_cls1\Yolo26_cls_train1.py 
🚀 开始训练（YOLO11-CLS，ROP 分类） ...
WARNING 'label_smoothing' is deprecated and will be removed in the future.
Ultralytics 8.4.14  Python-3.8.20 torch-2.4.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4060 Ti, 16380MiB)
train: C:\Python\PyProjects2026\dataset_ROP_SZEH\images\train... found 822 images in 5 classes  
val: C:\Python\PyProjects2026\dataset_ROP_SZEH\images\val... found 163 images in 5 classes  
test: C:\Python\PyProjects2026\dataset_ROP_SZEH\images\test... found 114 images in 5 classes  
Overriding model.yaml nc=1000 with nc=5

...
100 epochs completed in 0.036 hours.

📈 Per-class Top-1 @ C:\Python\PyProjects2026\dataset_ROP_SZEH\images\train
class             top1_acc    correct/total
LaserScars          1.0000          257/257
Normal              1.0000          177/177
Stage1              1.0000            70/70
Stage2              0.9756          120/123
Stage3              1.0000          195/195
-------------------------------------------
OVERALL             0.9964          819/822


📈 Per-class Top-1 @ C:\Python\PyProjects2026\dataset_ROP_SZEH\images\val
class             top1_acc    correct/total
LaserScars          1.0000            51/51
Normal              1.0000            35/35
Stage1              0.7857            11/14
Stage2              0.7500            18/24
Stage3              0.8718            34/39
-------------------------------------------
OVERALL             0.9141          149/163

📊 训练结束，开始验证（val） ...
📈 Per-class Top-1 @ C:\Python\PyProjects2026\dataset_ROP_SZEH\images\test
class             top1_acc    correct/total
LaserScars          1.0000            35/35
Normal              1.0000            24/24
Stage1              0.5000             5/10
Stage2              0.6667            12/18
Stage3              0.9630            26/27
-------------------------------------------
OVERALL             0.8947          102/114

✅ 完成。输出目录： cls_rop\yolo26m_ROP_cls_exp
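The training script that produced this log is not shown in the article. The following is a minimal sketch of what a comparable run could look like, not the author's script. It assumes the folder-per-class layout that Ultralytics classification training expects (a root folder with train, val, and test subfolders, each holding one folder per class), the yolo26m-cls.pt weights downloaded in section 1, and project/name values guessed from the output directory in the log. The image size of 224 is also an assumption.

```python
def build_train_args(data_root, epochs=100, imgsz=224):
    """Collect keyword arguments for YOLO(...).train() on a folder-per-class dataset."""
    return {
        "data": data_root,              # root folder containing train/, val/, test/
        "epochs": epochs,               # the log above reports 100 epochs
        "imgsz": imgsz,                 # assumption: the log does not state the image size
        "project": "cls_rop",           # assumption, inferred from the output directory
        "name": "yolo26m_ROP_cls_exp",  # assumption, inferred from the output directory
    }


def train_classifier(data_root, weights="yolo26m-cls.pt"):
    """Train a classification model; requires the ultralytics package."""
    # Imported lazily so the argument helper can be used without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(**build_train_args(data_root))


# Usage (needs ultralytics, the weights file, and the dataset on disk):
# train_classifier(r"C:\Python\PyProjects2026\dataset_ROP_SZEH\images")
print(build_train_args("dataset_ROP_SZEH/images"))
```

With a classification model the dataset is passed as a folder path rather than a YAML file, and the class names are taken from the subfolder names, which matches the "found 822 images in 5 classes" lines in the log.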


4.2 ROP Classification Model Training (Preprocessed Images)

For the preprocessing method, see: [AI-Assisted Programming] ROP image preprocessing

C:\Python\Miniconda3\envs\YOLO26\python.exe C:\Python\PyProjects2026\YOLO26_ROP_cls1\Yolo26_cls_train1.py 
🚀 开始训练（YOLO11-CLS，ROP 分类） ...
WARNING 'label_smoothing' is deprecated and will be removed in the future.
Ultralytics 8.4.14  Python-3.8.20 torch-2.4.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4060 Ti, 16380MiB)
Overriding model.yaml nc=1000 with nc=5
...
Starting training for 100 epochs...


📈 Per-class Top-1 @ C:\Python\PyProjects2026\dataset_ROP_SZEH\images_preprocessed\train
class             top1_acc    correct/total
LaserScars          1.0000          257/257
Normal              1.0000          177/177
Stage1              0.9571            67/70
Stage2              0.9593          118/123
Stage3              0.9897          193/195
-------------------------------------------
OVERALL             0.9878          812/822


📈 Per-class Top-1 @ C:\Python\PyProjects2026\dataset_ROP_SZEH\images_preprocessed\val
class             top1_acc    correct/total
LaserScars          0.9804            50/51
Normal              0.9714            34/35
Stage1              0.5000             7/14
Stage2              0.7500            18/24
Stage3              0.9487            37/39
-------------------------------------------
OVERALL             0.8957          146/163


📈 Per-class Top-1 @ C:\Python\PyProjects2026\dataset_ROP_SZEH\images_preprocessed\test
class             top1_acc    correct/total
LaserScars          0.9714            34/35
Normal              0.9583            23/24
Stage1              0.4000             4/10
Stage2              0.4444             8/18
Stage3              0.9259            25/27
-------------------------------------------
OVERALL             0.8246           94/114
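To compare the two runs on equal terms, the per-class counts from the two test-split tables can be turned into overall accuracy and macro-averaged accuracy (the unweighted mean of the per-class values). The counts below are copied from the logs above; the helper name is illustrative.

```python
def summarize(per_class):
    """per_class maps class name -> (correct, total). Returns (overall, macro) accuracy."""
    correct = sum(c for c, _ in per_class.values())
    total = sum(t for _, t in per_class.values())
    macro = sum(c / t for c, t in per_class.values()) / len(per_class)
    return correct / total, macro


# Test-split counts copied from the two logs above.
original_test = {"LaserScars": (35, 35), "Normal": (24, 24), "Stage1": (5, 10),
                 "Stage2": (12, 18), "Stage3": (26, 27)}
preprocessed_test = {"LaserScars": (34, 35), "Normal": (23, 24), "Stage1": (4, 10),
                     "Stage2": (8, 18), "Stage3": (25, 27)}

for name, counts in (("original", original_test), ("preprocessed", preprocessed_test)):
    overall, macro = summarize(counts)
    print(f"{name}: overall={overall:.4f} macro={macro:.4f}")
```

Macro accuracy weights the small Stage1 and Stage2 classes as heavily as the large ones, so it shows the gap between the two runs more clearly than the overall figure does.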

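5. YOLO26 Model and Parameter Reference

5.1 Model Configuration File

Taking the detection model as an example, the YOLO26 model configuration file ".\ultralytics\cfg\models\26\yolo26.yaml" is shown below. It has three main parts: the model parameters (Parameters), the backbone (backbone), and the detection head (head).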

# Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license

# Ultralytics YOLO26 object detection model with P3/8 - P5/32 outputs
# Model docs: https://docs.ultralytics.com/models/yolo26
# Task docs: https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 80 # number of classes
end2end: True # whether to use end-to-end mode
reg_max: 1 # DFL bins
scales: # model compound scaling constants, i.e. 'model=yolo26n.yaml' will call yolo26.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.50, 0.25, 1024] # summary: 260 layers, 2,572,280 parameters, 2,572,280 gradients, 6.1 GFLOPs
  s: [0.50, 0.50, 1024] # summary: 260 layers, 10,009,784 parameters, 10,009,784 gradients, 22.8 GFLOPs
  m: [0.50, 1.00, 512] # summary: 280 layers, 21,896,248 parameters, 21,896,248 gradients, 75.4 GFLOPs
  l: [1.00, 1.00, 512] # summary: 392 layers, 26,299,704 parameters, 26,299,704 gradients, 93.8 GFLOPs
  x: [1.00, 1.50, 512] # summary: 392 layers, 58,993,368 parameters, 58,993,368 gradients, 209.5 GFLOPs

# YOLO26n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
  - [-1, 2, C3k2, [256, False, 0.25]]
  - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
  - [-1, 2, C3k2, [512, False, 0.25]]
  - [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
  - [-1, 2, C3k2, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
  - [-1, 2, C3k2, [1024, True]]
  - [-1, 1, SPPF, [1024, 5, 3, True]] # 9
  - [-1, 2, C2PSA, [1024]] # 10

# YOLO26n head
head:
  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 6], 1, Concat, [1]] # cat backbone P4
  - [-1, 2, C3k2, [512, True]] # 13

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 4], 1, Concat, [1]] # cat backbone P3
  - [-1, 2, C3k2, [256, True]] # 16 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 13], 1, Concat, [1]] # cat head P4
  - [-1, 2, C3k2, [512, True]] # 19 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 10], 1, Concat, [1]] # cat head P5
  - [-1, 1, C3k2, [1024, True, 0.5, True]] # 22 (P5/32-large)

  - [[16, 19, 22], 1, Detect, [nc]] # Detect(P3, P4, P5)
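The scales entry explains how one YAML file yields five model sizes. The sketch below shows how [depth, width, max_channels] is typically applied to a layer's repeat count and output channels. It assumes the usual Ultralytics rule (repeats multiplied by depth and rounded, channels capped at max_channels, multiplied by width, then rounded up to a multiple of 8) and is not a copy of the library code.

```python
import math

# [depth, width, max_channels] per scale, copied from yolo26.yaml above.
SCALES = {
    "n": (0.50, 0.25, 1024),
    "s": (0.50, 0.50, 1024),
    "m": (0.50, 1.00, 512),
    "l": (1.00, 1.00, 512),
    "x": (1.00, 1.50, 512),
}


def scale_layer(repeats, channels, scale):
    """Return (repeats, channels) after applying the compound scaling constants."""
    depth, width, max_channels = SCALES[scale]
    # Depth scales the repeat count, but a layer is never removed entirely.
    n = max(round(repeats * depth), 1) if repeats > 1 else repeats
    # Width scales the channels after capping; the result is rounded up to a multiple of 8.
    c = math.ceil(min(channels, max_channels) * width / 8) * 8
    return n, c


# The last backbone C3k2 layer is declared as repeats=2, channels=1024.
for scale in SCALES:
    print(scale, scale_layer(2, 1024, scale))
```

Under these assumptions the nominal 1024-channel layer shrinks to 256 channels for the n model and is capped at 512 before width scaling for m, l, and x.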

5.2 Parameter Configuration

The contents of the YOLO26 configuration file "./ultralytics/cfg/default.yaml" are shown below.

# Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license

# Global configuration YAML with settings and hyperparameters for YOLO training, validation, prediction and export
# For documentation see https://docs.ultralytics.com/usage/cfg/

task: detect # (str) YOLO task, i.e. detect, segment, classify, pose, obb
mode: train # (str) YOLO mode, i.e. train, val, predict, export, track, benchmark

# Train settings -------------------------------------------------------------------------------------------------------
model: # (str, optional) path to model file, i.e. yolov8n.pt or yolov8n.yaml
data: # (str, optional) path to data file, i.e. coco8.yaml
epochs: 100 # (int) number of epochs to train for
time: # (float, optional) max hours to train; overrides epochs if set
patience: 100 # (int) early stop after N epochs without val improvement
batch: 16 # (int) batch size; use -1 for AutoBatch
imgsz: 640 # (int | list) train/val use int (square); predict/export may use [h,w]
save: True # (bool) save train checkpoints and predict results
save_period: -1 # (int) save checkpoint every N epochs; disabled if < 1
cache: False # (bool | str) cache images in RAM (True/'ram') or on 'disk' to speed dataloading; False disables
device: # (int | str | list) device: 0 or [0,1,2,3] for CUDA, 'cpu'/'mps', or -1/[-1,-1] to auto-select idle GPUs
workers: 8 # (int) dataloader workers (per RANK if DDP)
project: # (str, optional) project name for results root
name: # (str, optional) experiment name; results in 'project/name'
exist_ok: False # (bool) overwrite existing 'project/name' if True
pretrained: True # (bool | str) use pretrained weights (bool) or load weights from path (str)
optimizer: auto # (str) optimizer: SGD, Adam, Adamax, AdamW, NAdam, RAdam, RMSProp, or auto
verbose: True # (bool) print verbose logs during training/val
seed: 0 # (int) random seed for reproducibility
deterministic: True # (bool) enable deterministic ops; reproducible but may be slower
single_cls: False # (bool) treat all classes as a single class
rect: False # (bool) rectangular batches for train; rectangular batching for val when mode='val'
cos_lr: False # (bool) cosine learning rate scheduler
close_mosaic: 10 # (int) disable mosaic augmentation for final N epochs (0 to keep enabled)
resume: False # (bool) resume training from last checkpoint in the run dir
amp: True # (bool) Automatic Mixed Precision (AMP) training; True runs AMP capability check
fraction: 1.0 # (float) fraction of training dataset to use (1.0 = all)
profile: False # (bool) profile ONNX/TensorRT speeds during training for loggers
freeze: # (int | list, optional) freeze first N layers (int) or specific layer indices (list)
multi_scale: 0.0 # (float) multi-scale range as a fraction of imgsz; sizes are rounded to stride multiples
compile: False # (bool | str) enable torch.compile() backend='inductor'; True="default", False=off, or "default|reduce-overhead|max-autotune-no-cudagraphs"

# Segmentation
overlap_mask: True # (bool) merge instance masks into one mask during training (segment only)
mask_ratio: 4 # (int) mask downsample ratio (segment only)

# Classification
dropout: 0.0 # (float) dropout for classification head (classify only)

# Val/Test settings ----------------------------------------------------------------------------------------------------
val: True # (bool) run validation/testing during training
split: val # (str) dataset split to evaluate: 'val', 'test' or 'train'
save_json: False # (bool) save results to COCO JSON for external evaluation
conf: # (float, optional) confidence threshold; defaults: predict=0.25, val=0.001
iou: 0.7 # (float) IoU threshold used for NMS
max_det: 300 # (int) maximum number of detections per image
half: False # (bool) use half precision (FP16) if supported
dnn: False # (bool) use OpenCV DNN for ONNX inference
plots: True # (bool) save plots and images during train/val
end2end: # (bool, optional) whether to use end2end head(YOLO26, YOLOv10) for predict/val/export

# Predict settings -----------------------------------------------------------------------------------------------------
source: # (str, optional) path/dir/URL/stream for images or videos; e.g. 'ultralytics/assets' or '0' for webcam
vid_stride: 1 # (int) read every Nth frame for video sources
stream_buffer: False # (bool) True buffers all frames; False keeps the most recent frame for low-latency streams
visualize: False # (bool) visualize model features (predict) or TP/FP/FN confusion (val)
augment: False # (bool) apply test-time augmentation during prediction
agnostic_nms: False # (bool) class-agnostic NMS
classes: # (int | list[int], optional) filter by class id(s), e.g. 0 or [0,2,3]
retina_masks: False # (bool) use high-resolution segmentation masks (segment)
embed: # (list[int], optional) return feature embeddings from given layer indices

# Visualize settings ---------------------------------------------------------------------------------------------------
show: False # (bool) show images/videos in a window if supported
save_frames: False # (bool) save individual frames from video predictions
save_txt: False # (bool) save results as .txt files (xywh format)
save_conf: False # (bool) save confidence scores with results
save_crop: False # (bool) save cropped prediction regions to files
show_labels: True # (bool) draw class labels on images, e.g. 'person'
show_conf: True # (bool) draw confidence values on images, e.g. '0.99'
show_boxes: True # (bool) draw bounding boxes on images
line_width: # (int, optional) line width of boxes; auto-scales with image size if not set

# Export settings ------------------------------------------------------------------------------------------------------
format: torchscript # (str) target format, e.g. torchscript|onnx|openvino|engine|coreml|saved_model|pb|tflite|edgetpu|tfjs|paddle|mnn|ncnn|imx|rknn|executorch
keras: False # (bool) TF SavedModel only (format=saved_model); enable Keras layers during export
optimize: False # (bool) TorchScript only; apply mobile optimizations to the scripted model
int8: False # (bool) INT8/PTQ where supported (openvino, tflite, tfjs, engine, imx); needs calibration data/fraction
dynamic: False # (bool) dynamic shapes for torchscript, onnx, openvino, engine; enable variable image sizes
simplify: True # (bool) ONNX/engine only; run graph simplifier for cleaner ONNX before runtime conversion
opset: # (int, optional) ONNX/engine only; opset version for export; leave unset to use a tested default
workspace: # (float, optional) engine (TensorRT) only; workspace size in GiB, e.g. 4
nms: False # (bool) fuse NMS into exported model when backend supports; if True, conf/iou apply (agnostic_nms except coreml)

# Hyperparameters ------------------------------------------------------------------------------------------------------
lr0: 0.01 # (float) initial learning rate (SGD=1e-2, Adam/AdamW=1e-3)
lrf: 0.01 # (float) final LR fraction; final LR = lr0 * lrf
momentum: 0.937 # (float) SGD momentum or Adam beta1
weight_decay: 0.0005 # (float) weight decay (L2 regularization)
warmup_epochs: 3.0 # (float) warmup epochs (fractions allowed)
warmup_momentum: 0.8 # (float) initial momentum during warmup
warmup_bias_lr: 0.1 # (float) bias learning rate during warmup
box: 7.5 # (float) box loss gain
cls: 0.5 # (float) classification loss gain
dfl: 1.5 # (float) distribution focal loss gain
pose: 12.0 # (float) pose loss gain (pose tasks)
kobj: 1.0 # (float) keypoint objectness loss gain (pose tasks)
rle: 1.0 # (float) rle loss gain (pose tasks)
angle: 1.0 # (float) oriented angle loss gain (obb tasks)
nbs: 64 # (int) nominal batch size used for loss normalization
hsv_h: 0.015 # (float) HSV hue augmentation fraction
hsv_s: 0.7 # (float) HSV saturation augmentation fraction
hsv_v: 0.4 # (float) HSV value (brightness) augmentation fraction
degrees: 0.0 # (float) rotation degrees (+/-)
translate: 0.1 # (float) translation fraction (+/-)
scale: 0.5 # (float) scale gain (+/-)
shear: 0.0 # (float) shear degrees (+/-)
perspective: 0.0 # (float) perspective fraction (0–0.001 typical)
flipud: 0.0 # (float) vertical flip probability
fliplr: 0.5 # (float) horizontal flip probability
bgr: 0.0 # (float) RGB↔BGR channel swap probability
mosaic: 1.0 # (float) mosaic augmentation probability
mixup: 0.0 # (float) MixUp augmentation probability
cutmix: 0.0 # (float) CutMix augmentation probability
copy_paste: 0.0 # (float) segmentation copy-paste probability
copy_paste_mode: flip # (str) copy-paste strategy for segmentation: flip or mixup
auto_augment: randaugment # (str) classification auto augmentation policy: randaugment, autoaugment, augmix
erasing: 0.4 # (float) random erasing probability for classification (0–0.9), <1.0

# Custom config.yaml ---------------------------------------------------------------------------------------------------
cfg: # (str, optional) path to a config.yaml that overrides defaults

# Tracker settings ------------------------------------------------------------------------------------------------------
tracker: botsort.yaml # (str) tracker config file: botsort.yaml or bytetrack.yaml

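Training settings
Training settings for YOLO models cover the hyperparameters and configuration that affect performance, speed, and accuracy. Key settings include batch size, learning rate, momentum, and weight decay. The choice of optimizer, loss function, and dataset composition also affects training. Tuning and experimentation are essential for getting the best results.

Argument	Type	Default	Description
model	str	None	Specifies the model file for training. Accepts a path to either a .pt pretrained model or a .yaml configuration file. Essential for defining the model structure or initializing weights.
data	str	None	Path to the dataset configuration file (e.g., coco8.yaml). This file contains dataset-specific parameters, including the paths to the training and validation data, the class names, and the number of classes.
epochs	int	100	Total number of training epochs. Each epoch is one full pass over the dataset. Adjusting this value affects training duration and model performance.
time	float	None	Maximum training time in hours. If set, it overrides the epochs argument, so training stops automatically after the given duration. Useful for time-constrained training.
patience	int	100	Number of epochs to wait without improvement in the validation metrics before stopping early. Helps prevent overfitting by stopping when performance plateaus.
batch	int or float	16	Batch size, with three modes: an integer (e.g., batch=16), auto mode for 60% GPU memory utilization (batch=-1), or auto mode with a specified utilization fraction (batch=0.70).
imgsz	int	640	Target image size for training. Images are resized to squares with sides equal to the given value (if rect=False), preserving the aspect ratio for YOLO models but not for RT-DETR. Affects model accuracy and computational cost.
save	bool	True	Enables saving of training checkpoints and the final model weights. Useful for resuming training or deploying the model.
save_period	int	-1	How often to save model checkpoints, in epochs. A value of -1 disables this feature. Useful for saving interim models during long training runs.
cache	bool	False	Caches dataset images in memory (True/ram), on disk (disk), or not at all (False). Speeds up training by reducing disk I/O at the cost of higher memory usage.
device	int or str or list	None	Compute device(s) for training: a single GPU (device=0), multiple GPUs (device=[0,1]), CPU (device=cpu), MPS for Apple silicon (device=mps), automatic selection of the most idle GPU (device=-1), or of several idle GPUs (device=[-1,-1]).
workers	int	8	Number of worker threads for data loading (per RANK for multi-GPU training). Affects how fast data is preprocessed and fed to the model, especially in multi-GPU setups.
project	str	None	Name of the project directory where training outputs are saved. Allows organized storage of different experiments.
name	str	None	Name of the training run. Used to create a subdirectory in the project folder where the training logs and outputs are stored.
exist_ok	bool	False	If True, allows overwriting an existing project/name directory. Useful for iterative experiments without manually clearing previous outputs.
pretrained	bool or str	True	Whether to start training from a pretrained model. Can be a boolean or a string path to a specific model to load weights from. Improves training efficiency and model performance.
optimizer	str	'auto'	Choice of optimizer. Options are SGD, MuSGD, Adam, Adamax, AdamW, NAdam, RAdam, RMSProp, or auto for automatic selection based on the model configuration. Affects convergence speed and stability.
seed	int	0	Random seed for training, which makes results reproducible across runs with the same configuration.
deterministic	bool	True	Forces deterministic algorithms, ensuring reproducibility but possibly reducing performance and speed because non-deterministic algorithms are restricted.
verbose	bool	True	Enables verbose output during training, showing progress bars, per-epoch metrics, and additional training information in the console.
single_cls	bool	False	Treats all classes in a multi-class dataset as a single class during training. Useful for binary classification tasks or when object presence matters more than class.
classes	list[int]	None	List of class IDs to train on. Useful for filtering out classes and focusing on a subset during training.
rect	bool	False	Enables a minimum-padding strategy: images in a batch are padded as little as possible to reach a common size, with the longest side equal to imgsz. Can improve efficiency and speed but may affect accuracy.
multi_scale	float	0.0	Randomly varies imgsz per batch by ± multi_scale (e.g. 0.25 -> 0.75x to 1.25x), rounded to a multiple of the model stride; 0.0 disables multi-scale training.
cos_lr	bool	False	Uses a cosine learning rate scheduler, adjusting the learning rate along a cosine curve over the epochs. Helps manage the learning rate for better convergence.
close_mosaic	int	10	Disables mosaic augmentation in the last N epochs to stabilize training before completion. Setting it to 0 disables this feature.
resume	bool	False	Resumes training from the last saved checkpoint. Automatically loads the model weights, optimizer state, and epoch count.
amp	bool	True	Enables Automatic Mixed Precision (AMP) training, reducing memory usage and possibly speeding up training with minimal impact on accuracy.
fraction	float	1.0	Fraction of the dataset to use for training. Allows training on a subset, which is useful for experiments or when resources are limited.
profile	bool	False	Enables profiling of ONNX and TensorRT speeds during training, which helps when optimizing deployment.
freeze	int or list	None	Freezes the first N layers of the model or the layers given by index, reducing the number of trainable parameters. Useful for fine-tuning or transfer learning.
lr0	float	0.01	Initial learning rate (e.g. SGD=1E-2, Adam=1E-3). Tuning this value is crucial because it determines how quickly the model weights are updated.
lrf	float	0.01	Final learning rate as a fraction of the initial rate (final LR = lr0 * lrf), used with the scheduler to adjust the learning rate over time.
momentum	float	0.937	Momentum factor for SGD or beta1 for Adam optimizers, controlling how much past gradients contribute to the current update.
weight_decay	float	0.0005	L2 regularization term that penalizes large weights to prevent overfitting.
warmup_epochs	float	3.0	Number of warmup epochs, during which the learning rate rises gradually from a low value to the initial learning rate to stabilize early training.
warmup_momentum	float	0.8	Initial momentum for the warmup phase, adjusted gradually to the set momentum during warmup.
warmup_bias_lr	float	0.1	Learning rate for the bias parameters during warmup, which helps stabilize training in the first epochs.
box	float	7.5	Weight of the box loss component, controlling how much emphasis is placed on accurate bounding box coordinates.
cls	float	0.5	Weight of the classification loss in the total loss, controlling the importance of correct class prediction relative to the other components.
dfl	float	1.5	Weight of the distribution focal loss, used in some YOLO versions for fine-grained classification.
pose	float	12.0	Weight of the pose loss in pose estimation models, controlling the emphasis on accurate keypoint prediction.
kobj	float	1.0	Weight of the keypoint objectness loss in pose estimation models, balancing detection confidence against pose accuracy.
rle	float	1.0	Weight of the residual log-likelihood estimation loss in pose estimation models, affecting keypoint localization precision.
angle	float	1.0	Weight of the angle loss in obb models, affecting the precision of oriented bounding box angle predictions.
nbs	int	64	Nominal batch size used for loss normalization.
overlap_mask	bool	True	Determines whether object masks are merged into a single mask for training or kept separate per object. When masks overlap, the smaller mask is placed on top of the larger one during merging.
mask_ratio	int	4	Downsample ratio for segmentation masks, affecting the mask resolution used during training.
dropout	float	0.0	Dropout rate for regularization in classification tasks, randomly omitting units during training to prevent overfitting.
val	bool	True	Enables validation during training, allowing periodic evaluation on a separate dataset.
plots	bool	True	Generates and saves plots of the training and validation metrics along with prediction examples, giving visual insight into model performance and learning progress.
compile	bool or str	False	Enables PyTorch 2.x torch.compile graph compilation with backend='inductor'. Accepts True → "default", False → disabled, or a string mode such as "default", "reduce-overhead", "max-autotune-no-cudagraphs". Falls back to eager mode with a warning if unsupported.
max_det	int	300	Maximum number of objects retained during the validation phase of training.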
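The interaction between lr0, lrf, and cos_lr is easier to see in code. The sketch below assumes the linear and cosine decay rules commonly used in Ultralytics trainers, in which the learning rate falls from lr0 to lr0 * lrf over the run. It ignores the warmup phase, and the function name is illustrative.

```python
import math


def lr_at_epoch(epoch, epochs=100, lr0=0.01, lrf=0.01, cos_lr=False):
    """Learning rate at a given epoch, decaying from lr0 to lr0 * lrf (warmup ignored)."""
    if cos_lr:
        # Cosine decay: the factor goes from 1 to lrf along half a cosine period.
        factor = ((1 - math.cos(epoch * math.pi / epochs)) / 2) * (lrf - 1) + 1
    else:
        # Linear decay: the factor goes from 1 to lrf in a straight line.
        factor = max(1 - epoch / epochs, 0) * (1.0 - lrf) + lrf
    return lr0 * factor


for epoch in (0, 25, 50, 100):
    print(epoch, lr_at_epoch(epoch), lr_at_epoch(epoch, cos_lr=True))
```

Both schedules start at lr0 and end at lr0 * lrf; the cosine variant keeps the learning rate higher in the first half of training and drops it faster in the second half.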

6. Summary

To recap, the basic steps for installing and using YOLO26 are as follows.

# 1. Install Anaconda and PyCharm

# 2. Create a Python environment
conda create -n YOLO26 python=3.8 -y

# 3. Activate the environment
conda activate YOLO26

# 4. Install PyTorch (example)
pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu121

# 5. Install Ultralytics
pip install -U ultralytics

# 6. Download pretrained weights
wget https://github.com/ultralytics/assets/releases/download/v8.4.0/yolo26n.pt
wget https://github.com/ultralytics/assets/releases/download/v8.4.0/yolo26s-seg.pt  # (example)
wget https://github.com/ultralytics/assets/releases/download/v8.4.0/yolo26m-cls.pt  # (example)

This article summarized the YOLO26 installation and configuration steps and showed how to train a YOLO26 classification model on a custom ROP dataset, using both the original and the preprocessed images.

[End of this section]

If you use YOLO26 in your research, please cite the original work:

@software{YOLO13,
  author = {Tian, Yunjie and Ye, Qixiang and Doermann, David},
  title = {YOLOv13: Attention-Centric Real-Time Object Detectors},
  year = {2025},
  url = {https://github.com/sunsmarterjie/YOLOv13},
  license = {AGPL-3.0}
}