【YOLO改进】换遍IoU损失函数之Innerciou Loss(基于MMYOLO)

替换Inner CIoU损失函数(基于MMYOLO)

由于MMYOLO中没有实现Inner CIoU损失函数,所以需要在mmyolo/models/iou_loss.py中添加Inner CIoU的计算和对应的iou_mode,修改完以后在终端运行

python setup.py install

再在配置文件中进行修改即可。修改例子如下:

    elif iou_mode == "innerciou":
        ratio=1.0
        w1_, h1_, w2_, h2_ = w1 / 2, h1 / 2, w2 / 2, h2 / 2
        x1 = bbox1_x1 + w1_
        y1 = bbox1_y1 + h1_
        x2 = bbox2_x1 + w2_
        y2 = bbox2_y1 + h2_

        inner_b1_x1, inner_b1_x2, inner_b1_y1, inner_b1_y2 = x1 - w1_ * ratio, x1 + w1_ * ratio, \
                                                             y1 - h1_ * ratio, y1 + h1_ * ratio
        inner_b2_x1, inner_b2_x2, inner_b2_y1, inner_b2_y2 = x2 - w2_ * ratio, x2 + w2_ * ratio, \
                                                             y2 - h2_ * ratio, y2 + h2_ * ratio
        inner_inter = (torch.min(inner_b1_x2, inner_b2_x2) - torch.max(inner_b1_x1, inner_b2_x1)).clamp(0) * \
                      (torch.min(inner_b1_y2, inner_b2_y2) - torch.max(inner_b1_y1, inner_b2_y1)).clamp(0)
        inner_union = w1 * ratio * h1 * ratio + w2 * ratio * h2 * ratio - inner_inter + eps
        inner_iou = inner_inter / inner_union

        # CIoU = IoU - ( (ρ^2(b_pred,b_gt) / c^2) + (alpha x v) )

        # calculate enclose area (c^2)
        enclose_area = enclose_w**2 + enclose_h**2 + eps

        # calculate ρ^2(b_pred,b_gt):
        # euclidean distance between b_pred(bbox2) and b_gt(bbox1)
        # center point, because bbox format is xyxy -> left-top xy and
        # right-bottom xy, so need to / 4 to get center point.
        rho2_left_item = ((bbox2_x1 + bbox2_x2) - (bbox1_x1 + bbox1_x2))**2 / 4
        rho2_right_item = ((bbox2_y1 + bbox2_y2) -
                           (bbox1_y1 + bbox1_y2))**2 / 4
        rho2 = rho2_left_item + rho2_right_item  # rho^2 (ρ^2)

        # Width and height ratio (v)
        wh_ratio = (4 / (math.pi**2)) * torch.pow(
            torch.atan(w2 / h2) - torch.atan(w1 / h1), 2)

        with torch.no_grad():
            alpha = wh_ratio / (wh_ratio - ious + (1 + eps))

        # innerCIoU
        ious = inner_iou - ((rho2 / enclose_area) + (alpha * wh_ratio))

修改后的配置文件(以configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py为例)

_base_ = ['../_base_/default_runtime.py', '../_base_/det_p5_tta.py']

# ========================Frequently modified parameters======================
# -----data related-----
data_root = 'data/coco/'  # Root path of data
# Path of train annotation file
train_ann_file = 'annotations/instances_train2017.json'
train_data_prefix = 'train2017/'  # Prefix of train image path
# Path of val annotation file
val_ann_file = 'annotations/instances_val2017.json'
val_data_prefix = 'val2017/'  # Prefix of val image path

num_classes = 80  # Number of classes for classification
# Batch size of a single GPU during training
train_batch_size_per_gpu = 16
# Worker to pre-fetch data for each single GPU during training
train_num_workers = 8
# persistent_workers must be False if num_workers is 0
persistent_workers = True

# -----model related-----
# Basic size of multi-scale prior box
anchors = [
    [(10, 13), (16, 30), (33, 23)],  # P3/8
    [(30, 61), (62, 45), (59, 119)],  # P4/16
    [(116, 90), (156, 198), (373, 326)]  # P5/32
]

# -----train val related-----
# Base learning rate for optim_wrapper. Corresponding to 8xb16=128 bs
base_lr = 0.01
max_epochs = 300  # Maximum training epochs

model_test_cfg = dict(
    # The config of multi-label for multi-class prediction.
    multi_label=True,
    # The number of boxes before NMS
    nms_pre=30000,
    score_thr=0.001,  # Threshold to filter out boxes.
    nms=dict(type='nms', iou_threshold=0.65),  # NMS type and threshold
    max_per_img=300)  # Max number of detections of each image

# ========================Possible modified parameters========================
# -----data related-----
img_scale = (640, 640)  # width, height
# Dataset type, this will be used to define the dataset
dataset_type = 'YOLOv5CocoDataset'
# Batch size of a single GPU during validation
val_batch_size_per_gpu = 1
# Worker to pre-fetch data for each single GPU during validation
val_num_workers = 2

# Config of batch shapes. Only on val.
# It means not used if batch_shapes_cfg is None.
batch_shapes_cfg = dict(
    type='BatchShapePolicy',
    batch_size=val_batch_size_per_gpu,
    img_size=img_scale[0],
    # The image scale of padding should be divided by pad_size_divisor
    size_divisor=32,
    # Additional paddings for pixel scale
    extra_pad_ratio=0.5)

# -----model related-----
# The scaling factor that controls the depth of the network structure
deepen_factor = 0.33
# The scaling factor that controls the width of the network structure
widen_factor = 0.5
# Strides of multi-scale prior box
strides = [8, 16, 32]
num_det_layers = 3  # The number of model output scales
norm_cfg = dict(type='BN', momentum=0.03, eps=0.001)  # Normalization config

# -----train val related-----
affine_scale = 0.5  # YOLOv5RandomAffine scaling ratio
loss_cls_weight = 0.5
loss_bbox_weight = 0.05
loss_obj_weight = 1.0
prior_match_thr = 4.  # Priori box matching threshold
# The obj loss weights of the three output layers
obj_level_weights = [4., 1., 0.4]
lr_factor = 0.01  # Learning rate scaling factor
weight_decay = 0.0005
# Save model checkpoint and validation intervals
save_checkpoint_intervals = 10
# The maximum checkpoints to keep.
max_keep_ckpts = 3
# Single-scale training is recommended to
# be turned on, which can speed up training.
env_cfg = dict(cudnn_benchmark=True)

# ===============================Unmodified in most cases====================
model = dict(
    type='YOLODetector',
    data_preprocessor=dict(
        type='mmdet.DetDataPreprocessor',
        mean=[0., 0., 0.],
        std=[255., 255., 255.],
        bgr_to_rgb=True),
    backbone=dict(
        ##使用YOLOv8的主干网络

        type='YOLOv8CSPDarknet',
        deepen_factor=deepen_factor,
        widen_factor=widen_factor,
        norm_cfg=norm_cfg,
        act_cfg=dict(type='SiLU', inplace=True)

    ),
    neck=dict(
        type='YOLOv5PAFPN',
        deepen_factor=deepen_factor,
        widen_factor=widen_factor,
        in_channels=[256, 512, 1024],
        out_channels=[256, 512, 1024],
        num_csp_blocks=3,
        norm_cfg=norm_cfg,
        act_cfg=dict(type='SiLU', inplace=True)),
    bbox_head=dict(
        type='YOLOv5Head',
        head_module=dict(
            type='YOLOv5HeadModule',
            num_classes=num_classes,
            in_channels=[256, 512, 1024],
            widen_factor=widen_factor,
            featmap_strides=strides,
            num_base_priors=3),
        prior_generator=dict(
            type='mmdet.YOLOAnchorGenerator',
            base_sizes=anchors,
            strides=strides),
        # scaled based on number of detection layers
        loss_cls=dict(
            type='mmdet.CrossEntropyLoss',
            use_sigmoid=True,
            reduction='mean',
            loss_weight=loss_cls_weight *
            (num_classes / 80 * 3 / num_det_layers)),
        # 修改此处实现IoU损失函数的替换
        loss_bbox=dict(
            type='IoULoss',
            iou_mode='innerciou',
            bbox_format='xywh',
            eps=1e-7,
            reduction='mean',
            loss_weight=loss_bbox_weight * (3 / num_det_layers),
            return_iou=True),
        loss_obj=dict(
            type='mmdet.CrossEntropyLoss',
            use_sigmoid=True,
            reduction='mean',
            loss_weight=loss_obj_weight *
            ((img_scale[0] / 640)**2 * 3 / num_det_layers)),
        prior_match_thr=prior_match_thr,
        obj_level_weights=obj_level_weights),
    test_cfg=model_test_cfg)

albu_train_transforms = [
    dict(type='Blur', p=0.01),
    dict(type='MedianBlur', p=0.01),
    dict(type='ToGray', p=0.01),
    dict(type='CLAHE', p=0.01)
]

pre_transform = [
    dict(type='LoadImageFromFile', file_client_args=_base_.file_client_args),
    dict(type='LoadAnnotations', with_bbox=True)
]

train_pipeline = [
    *pre_transform,
    dict(
        type='Mosaic',
        img_scale=img_scale,
        pad_val=114.0,
        pre_transform=pre_transform),
    dict(
        type='YOLOv5RandomAffine',
        max_rotate_degree=0.0,
        max_shear_degree=0.0,
        scaling_ratio_range=(1 - affine_scale, 1 + affine_scale),
        # img_scale is (width, height)
        border=(-img_scale[0] // 2, -img_scale[1] // 2),
        border_val=(114, 114, 114)),
    dict(
        type='mmdet.Albu',
        transforms=albu_train_transforms,
        bbox_params=dict(
            type='BboxParams',
            format='pascal_voc',
            label_fields=['gt_bboxes_labels', 'gt_ignore_flags']),
        keymap={
            'img': 'image',
            'gt_bboxes': 'bboxes'
        }),
    dict(type='YOLOv5HSVRandomAug'),
    dict(type='mmdet.RandomFlip', prob=0.5),
    dict(
        type='mmdet.PackDetInputs',
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape', 'flip',
                   'flip_direction'))
]

train_dataloader = dict(
    batch_size=train_batch_size_per_gpu,
    num_workers=train_num_workers,
    persistent_workers=persistent_workers,
    pin_memory=True,
    sampler=dict(type='DefaultSampler', shuffle=True),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file=train_ann_file,
        data_prefix=dict(img=train_data_prefix),
        filter_cfg=dict(filter_empty_gt=False, min_size=32),
        pipeline=train_pipeline))

test_pipeline = [
    dict(type='LoadImageFromFile', file_client_args=_base_.file_client_args),
    dict(type='YOLOv5KeepRatioResize', scale=img_scale),
    dict(
        type='LetterResize',
        scale=img_scale,
        allow_scale_up=False,
        pad_val=dict(img=114)),
    dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),
    dict(
        type='mmdet.PackDetInputs',
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                   'scale_factor', 'pad_param'))
]

val_dataloader = dict(
    batch_size=val_batch_size_per_gpu,
    num_workers=val_num_workers,
    persistent_workers=persistent_workers,
    pin_memory=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        test_mode=True,
        data_prefix=dict(img=val_data_prefix),
        ann_file=val_ann_file,
        pipeline=test_pipeline,
        batch_shapes_cfg=batch_shapes_cfg))

test_dataloader = val_dataloader

param_scheduler = None
optim_wrapper = dict(
    type='OptimWrapper',
    optimizer=dict(
        type='SGD',
        lr=base_lr,
        momentum=0.937,
        weight_decay=weight_decay,
        nesterov=True,
        batch_size_per_gpu=train_batch_size_per_gpu),
    constructor='YOLOv5OptimizerConstructor')

default_hooks = dict(
    param_scheduler=dict(
        type='YOLOv5ParamSchedulerHook',
        scheduler_type='linear',
        lr_factor=lr_factor,
        max_epochs=max_epochs),
    checkpoint=dict(
        type='CheckpointHook',
        interval=save_checkpoint_intervals,
        save_best='auto',
        max_keep_ckpts=max_keep_ckpts))

custom_hooks = [
    dict(
        type='EMAHook',
        ema_type='ExpMomentumEMA',
        momentum=0.0001,
        update_buffers=True,
        strict_load=False,
        priority=49)
]

val_evaluator = dict(
    type='mmdet.CocoMetric',
    proposal_nums=(100, 1, 10),
    ann_file=data_root + val_ann_file,
    metric='bbox')
test_evaluator = val_evaluator

train_cfg = dict(
    type='EpochBasedTrainLoop',
    max_epochs=max_epochs,
    val_interval=save_checkpoint_intervals)
val_cfg = dict(type='ValLoop')
test_cfg = dict(type='TestLoop')

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.mfbz.cn/a/591862.html

如若内容造成侵权/违法违规/事实不符,请联系我们进行投诉反馈qq邮箱809451989@qq.com,一经查实,立即删除!

相关文章

Idea 自动生成测试

先添加测试依赖&#xff01;&#xff01; <!--Junit单元测试依赖--><dependency><groupId>org.junit.jupiter</groupId><artifactId>junit-jupiter</artifactId><version>5.9.1</version><scope>test</scope><…

MATLAB 集成

MATLAB 集成&#xff08;Integration&#xff09; 集成处理两种本质上不同的问题。 在第一种类型中&#xff0c;给出了函数的导数&#xff0c;我们想找到函数。因此&#xff0c;我们从根本上扭转了分化的过程。这种反向过程称为反微分&#xff0c;或者找到原始函数&#xff0…

基于SSM的宠物领养平台(有报告)。Javaee项目。ssm项目。

演示视频&#xff1a; 基于SSM的宠物领养平台&#xff08;有报告&#xff09;。Javaee项目。ssm项目。 项目介绍&#xff1a; 采用M&#xff08;model&#xff09;V&#xff08;view&#xff09;C&#xff08;controller&#xff09;三层体系结构&#xff0c;通过Spring Spri…

专项技能训练五《云计算网络技术与应用》实训7-1:安装mininet

文章目录 mininet安装1. 按6-1教程安装opendaylight控制器。2. 按6-2教程安装RYU控制器。3. 按5-1教程安装openvswitch虚拟交换机并开启服务。4. 将老师所给mininet安装包试用winSCP传送至电脑端。5. 安装net-tools。6. 安装mininet7. 安装完成后&#xff0c;使用命令建立拓扑&…

Stable Diffusion webUI 配置指南

Stable Diffusion webUI 配置指南 本博客主要介绍部署Stable Diffusion到本地&#xff0c;生成想要的风格图片。 文章目录 Stable Diffusion webUI 配置指南1、配置环境&#xff08;1&#xff09;pip环境[可选]&#xff08;2&#xff09;conda环境[可选] 2、配置Stable Diffu…

JavaScript 动态网页实例 —— 文字移动

前言 介绍文字使用的特殊效果。本章介绍文字的移动效果,主要包括:文字的垂直滚动、文字的渐隐渐显、文字的闪烁显示、文字的随意拖动、文字的坠落显示、页面内飘动的文字、漫天飞舞的文字、文字的下落效果。对于这些效果,读者只需稍加修改,就可以应用在自己的页面设计中。 …

农作物害虫检测数据集VOC+YOLO格式3575张10类别

数据集格式&#xff1a;Pascal VOC格式YOLO格式(不包含分割路径的txt文件&#xff0c;仅仅包含jpg图片以及对应的VOC格式xml文件和yolo格式txt文件) 图片数量(jpg文件个数)&#xff1a;3575 标注数量(xml文件个数)&#xff1a;3575 标注数量(txt文件个数)&#xff1a;3575 标注…

电话号码的字母组合 【C++】【力扣刷题】

解题思路&#xff1a; 以第一个为例,digits “23”&#xff0c;表明从电话号码的按键中选取2和3这两个字符&#xff0c;然后去寻找它们各自所对应的字母&#xff0c;这里每一个数字字符所对应的字母的不同&#xff0c;0对应的是空字符&#xff0c;而1的话题目中讲到是不对应任…

中药辨别二

声明&#xff1a;参考懒兔子公益课&#xff0c;参考网络资料和部分网络图片整理而成&#xff0c;仅供学习使用&#xff0c;不提供商业活动价值&#xff0c;文章描述的中药仅供学习&#xff0c;请在专业医师或专业医生指导下使用药材&#xff0c;擅自或其他情况下使用&#xff0…

LeetCode406:根据身高重建队列

题目描述 假设有打乱顺序的一群人站成一个队列&#xff0c;数组 people 表示队列中一些人的属性&#xff08;不一定按顺序&#xff09;。每个 people[i] [hi, ki] 表示第 i 个人的身高为 hi &#xff0c;前面 正好 有 ki 个身高大于或等于 hi 的人。 请你重新构造并返回输入数…

初学python记录:力扣1235. 规划兼职工作

题目&#xff1a; 你打算利用空闲时间来做兼职工作赚些零花钱。 这里有 n 份兼职工作&#xff0c;每份工作预计从 startTime[i] 开始到 endTime[i] 结束&#xff0c;报酬为 profit[i]。 给你一份兼职工作表&#xff0c;包含开始时间 startTime&#xff0c;结束时间 endTime …

[嵌入式AI从0开始到入土]17_Ascend C算子开发

[嵌入式AI从0开始到入土]嵌入式AI系列教程 注&#xff1a;等我摸完鱼再把链接补上 可以关注我的B站号工具人呵呵的个人空间&#xff0c;后期会考虑出视频教程&#xff0c;务必催更&#xff0c;以防我变身鸽王。 第1期 昇腾Altas 200 DK上手 第2期 下载昇腾案例并运行 第3期 官…

JDK14特性

JDK14 1 概述2 语法层面的变化1_instanceof的模式匹配(预览)2_switch表达式(标准)3_文本块改进(第二次预览)4_Records 记录类型(预览 JEP359) 3 API层面的变化4 关于GC1_G1的NUMA内存分配优化2_弃用SerialCMS,ParNewSerial Old3_删除CMS4_ZGC on macOS and Windows 4 其他变化1…

PPT基础

5种ppt仅可读形式 Ⅰ 开始选项卡 1.【幻灯片】组中&#xff1a;新建幻灯片&#xff0c;从大纲中导入幻灯片&#xff1b;修改幻灯片的版式&#xff1b;节&#xff08;新增节&#xff0c;重命名节&#xff09;。 2.【字体】组中&#xff1a;设置字体&#xff0c;字体大小&…

ctfshow web入门 sql注入 web224--web233

web224 扫描后台&#xff0c;发现robots.txt&#xff0c;访问发现/pwdreset.php &#xff0c;再访问可以重置密码 &#xff0c;登录之后发现上传文件 检查发现没有限制诶 上传txt,png,zip发现文件错误了 后面知道群里有个文件能上传 <? _$GET[1]_?>就是0x3c3f3d60245…

海外仓系统与跨境电商平台集成:有什么意义,为什么重要

跨境电商的发展趋势并没有丝毫放缓的迹象&#xff0c;这使得对高效率、综合性的海外仓的需求变得比以往任何时间都要多。 预测表明&#xff0c;未来一年跨境电商的市场份额将继续扩大。这一切都要求海外仓企业尽快提升仓储管理效率&#xff0c;在这个过程中&#xff0c;海外仓系…

小苹果

题目描述 小的桌子上放着几个苹果从左到右排成一列&#xff0c;编号为从1 到 。小苞是小的好朋友&#xff0c;每天她都会从中拿走一些苹果。每天在拿的时候&#xff0c;小苞都是从左侧第1个苹果开始、每隔2个苹果拿走1个苹果。随后小苞会将剩下的苹果按原先的顺序重新排成一列…

扩展学习|本体研究进展

文献来源&#xff1a; 王向前,张宝隆,李慧宗.本体研究综述[J].情报杂志,2016,35(06):163-170. 一、本体的定义 本体概念被引入人工智能、知识工程等领域后被赋予了新的含义。然而不同的专家学者对本体的理解不同,所给出的定义也有所差异。 人工智能领域的学者Neches(1991)等人对…

StampedLock(戳记锁)源码解读与使用

&#x1f3f7;️个人主页&#xff1a;牵着猫散步的鼠鼠 &#x1f3f7;️系列专栏&#xff1a;Java源码解读-专栏 &#x1f3f7;️个人学习笔记&#xff0c;若有缺误&#xff0c;欢迎评论区指正 1. 前言 我们在上一篇写ReentrantReadWriteLock读写锁的末尾留了一个小坑&#…

这书不错,古琴乐理实用教程(尹溧新编),有课学得通透。

通篇阅读后&#xff0c;发现这本书以古琴初习者、未系统接触过现代乐理的读者为对象&#xff0c;将复杂的古琴音乐理论简单化、通俗化。书中采用参照比较的方法、通俗易懂的语言、言简意赅的文字&#xff0c;并结合具体音乐作品将古琴研习中最主要的、最核心的理论知识进行简明…
最新文章