MobileNetV3

Compared with heavyweight networks, lightweight networks are characterized by fewer parameters, lower computational cost, and shorter inference time, which makes them better suited to scenarios with limited storage and power, such as mobile and embedded edge-computing devices. Lightweight networks have therefore attracted wide attention, and MobileNet is among the best of them. Building on the accumulated experience of V1 and V2, MobileNetV3 performs excellently in both accuracy and speed: its architecture parameters were obtained by NAS (network architecture search), it inherits several practical techniques from V1 and V2, and it introduces the SE channel-attention mechanism, making it a synthesis of the family's best ideas. This article is application-oriented and dissects the MobileNetV3 network structure alongside its code.

Main features

  1. The paper presents two versions, Large and Small, targeted at different resource budgets;
  2. The NetAdapt algorithm is used to find the optimal number of kernels and channels;
  3. Inherits the depthwise separable convolutions of V1;
  4. Inherits the inverted residual structure with linear bottlenecks of V2;
  5. Introduces the SE channel-attention structure;
  6. Uses a new activation function, h-swish(x), in place of ReLU6 (the "h" stands for "hard"; see the sketch after this list);
  7. Uses ReLU6(x + 3)/6, the hard-sigmoid, to approximate the sigmoid in the SE module;
  8. Modifies the output head at the back end of MobileNetV2 to reduce computation.
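
For reference, here is a minimal PyTorch sketch of the two "hard" functions above (they are equivalent to nn.Hardswish and nn.Hardsigmoid, which the code below uses):

import torch
from torch.nn import functional as F

def h_sigmoid(x: torch.Tensor) -> torch.Tensor:
    # hard-sigmoid: a piecewise-linear approximation of sigmoid
    return F.relu6(x + 3.0) / 6.0

def h_swish(x: torch.Tensor) -> torch.Tensor:
    # h-swish(x) = x * hard-sigmoid(x)
    return x * h_sigmoid(x)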

Overall structure

The figure shown here in the original post (image not reproduced) is the MobileNetV3 architecture diagram: Large and Small share the same overall structure and differ only in the number of bneck units and their internal parameters, mainly the channel counts (left: Small; right: Large).

The table above lists the concrete layer parameters; bneck is the network's basic building block. SE indicates whether the channel-attention mechanism is used. NL is the type of nonlinearity: HS (h-swish) or RE (ReLU). NBN means no batch normalization. s is the stride; the network downsamples with strided convolutions instead of pooling. For a 224x224 input, the stride-2 stem plus four stride-2 bnecks reduce the feature map to 7x7 before the final global pooling.


Official PyTorch MobileNetV3 code

A slightly modified version of the official code is given below:

from typing import Callable, List, Optional

import torch
from torch import nn, Tensor
from torch.nn import functional as F
from functools import partial


def _make_divisible(ch, divisor=8, min_ch=None):
    """
    This function is taken from the original tf repo.
    It ensures that all layers have a channel number that is divisible by 8
    It can be seen here:
    https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/mobilenet.py
    """
    if min_ch is None:
        min_ch = divisor
    new_ch = max(min_ch, int(ch + divisor / 2) // divisor * divisor)
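    # e.g. ch=37: int(37 + 4) // 8 * 8 = 40, and 40 >= 0.9 * 37, so 40 is returned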
    # Make sure that round down does not go down by more than 10%.
    if new_ch < 0.9 * ch:
        new_ch += divisor
    return new_ch


class ConvBNActivation(nn.Sequential):
    def __init__(self,
                 in_planes: int,
                 out_planes: int,
                 kernel_size: int = 3,
                 stride: int = 1,
                 groups: int = 1,
                 norm_layer: Optional[Callable[..., nn.Module]] = None,
                 activation_layer: Optional[Callable[..., nn.Module]] = None):
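        # "same" padding for odd kernel sizes, so stride-1 convs preserve spatial size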
        padding = (kernel_size - 1) // 2
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        if activation_layer is None:
            activation_layer = nn.ReLU6
        super(ConvBNActivation, self).__init__(nn.Conv2d(in_channels=in_planes,
                                                         out_channels=out_planes,
                                                         kernel_size=kernel_size,
                                                         stride=stride,
                                                         padding=padding,
                                                         groups=groups,
                                                         bias=False),
                                               norm_layer(out_planes),
                                               activation_layer(inplace=True))


class SqueezeExcitation(nn.Module):
    def __init__(self, input_c: int, squeeze_factor: int = 4):
        super(SqueezeExcitation, self).__init__()
        squeeze_c = _make_divisible(input_c // squeeze_factor, 8)
        self.fc1 = nn.Conv2d(input_c, squeeze_c, 1)
        self.fc2 = nn.Conv2d(squeeze_c, input_c, 1)

    def forward(self, x: Tensor) -> Tensor:
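        # squeeze: global average pooling collapses each channel to one value, shape (N, C, 1, 1)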
        scale = F.adaptive_avg_pool2d(x, output_size=(1, 1))
        scale = self.fc1(scale)
        scale = F.relu(scale, inplace=True)
        scale = self.fc2(scale)
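        # excite: hard-sigmoid maps the logits to per-channel gates in [0, 1]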
        scale = F.hardsigmoid(scale, inplace=True)
        return scale * x


class InvertedResidualConfig:
    def __init__(self,
                 input_c: int,
                 kernel: int,
                 expanded_c: int,
                 out_c: int,
                 use_se: bool,
                 activation: str,
                 stride: int,
                 width_multi: float):
        self.input_c = self.adjust_channels(input_c, width_multi)
        self.kernel = kernel
        self.expanded_c = self.adjust_channels(expanded_c, width_multi)
        self.out_c = self.adjust_channels(out_c, width_multi)
        self.use_se = use_se
        self.use_hs = activation == "HS"  # whether using h-swish activation
        self.stride = stride

    @staticmethod
    def adjust_channels(channels: int, width_multi: float):
        return _make_divisible(channels * width_multi, 8)


class InvertedResidual(nn.Module):
    def __init__(self,
                 cnf: InvertedResidualConfig,
                 norm_layer: Callable[..., nn.Module]):
        super(InvertedResidual, self).__init__()

        if cnf.stride not in [1, 2]:
            raise ValueError("illegal stride value.")

        # identity shortcut only when stride == 1 and input/output channel counts match
        self.use_res_connect = (cnf.stride == 1 and cnf.input_c == cnf.out_c)

        layers: List[nn.Module] = []
        activation_layer = nn.Hardswish if cnf.use_hs else nn.ReLU

        # expand
        if cnf.expanded_c != cnf.input_c:
            layers.append(ConvBNActivation(cnf.input_c,
                                           cnf.expanded_c,
                                           kernel_size=1,
                                           norm_layer=norm_layer,
                                           activation_layer=activation_layer))

        # depthwise
        layers.append(ConvBNActivation(cnf.expanded_c,
                                       cnf.expanded_c,
                                       kernel_size=cnf.kernel,
                                       stride=cnf.stride,
                                       groups=cnf.expanded_c,
                                       norm_layer=norm_layer,
                                       activation_layer=activation_layer))

        if cnf.use_se:
            layers.append(SqueezeExcitation(cnf.expanded_c))

        # project
        layers.append(ConvBNActivation(cnf.expanded_c,
                                       cnf.out_c,
                                       kernel_size=1,
                                       norm_layer=norm_layer,
                                       activation_layer=nn.Identity))

        self.block = nn.Sequential(*layers)
        self.out_channels = cnf.out_c
        self.is_strided = cnf.stride > 1

    def forward(self, x: Tensor) -> Tensor:
        result = self.block(x)
        if self.use_res_connect:
            result += x

        return result


class MobileNetV3(nn.Module):
    def __init__(self,
                 inverted_residual_setting: List[InvertedResidualConfig],
                 last_channel: int,
                 num_classes: int = 1000,
                 block: Optional[Callable[..., nn.Module]] = None,
                 norm_layer: Optional[Callable[..., nn.Module]] = None):
        super(MobileNetV3, self).__init__()

        if not inverted_residual_setting:
            raise ValueError("The inverted_residual_setting should not be empty.")
        elif not (isinstance(inverted_residual_setting, list) and
                  all([isinstance(s, InvertedResidualConfig) for s in inverted_residual_setting])):
            raise TypeError("The inverted_residual_setting should be List[InvertedResidualConfig]")

        if block is None:
            block = InvertedResidual

        if norm_layer is None:
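            # BatchNorm hyper-parameters used by the official torchvision implementation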
            norm_layer = partial(nn.BatchNorm2d, eps=0.001, momentum=0.01)

        layers: List[nn.Module] = []

        # building first layer
        firstconv_output_c = inverted_residual_setting[0].input_c
        layers.append(ConvBNActivation(3,
                                       firstconv_output_c,
                                       kernel_size=3,
                                       stride=2,
                                       norm_layer=norm_layer,
                                       activation_layer=nn.Hardswish))
        # building inverted residual blocks
        for cnf in inverted_residual_setting:
            layers.append(block(cnf, norm_layer))

        # building last several layers
        lastconv_input_c = inverted_residual_setting[-1].out_c
        lastconv_output_c = 6 * lastconv_input_c
        layers.append(ConvBNActivation(lastconv_input_c,
                                       lastconv_output_c,
                                       kernel_size=1,
                                       norm_layer=norm_layer,
                                       activation_layer=nn.Hardswish))
        self.features = nn.Sequential(*layers)
        self.avgpool = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Sequential(nn.Linear(lastconv_output_c, last_channel),
                                        nn.Hardswish(inplace=True),
                                        nn.Dropout(p=0.2, inplace=True),
                                        nn.Linear(last_channel, num_classes))

        # initial weights
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode="fan_out")
                if m.bias is not None:
                    nn.init.zeros_(m.bias)
            elif isinstance(m, (nn.BatchNorm2d, nn.GroupNorm)):
                nn.init.ones_(m.weight)
                nn.init.zeros_(m.bias)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.zeros_(m.bias)

    def _forward_impl(self, x: Tensor) -> Tensor:
        x = self.features(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.classifier(x)

        return x

    def forward(self, x: Tensor) -> Tensor:
        return self._forward_impl(x)


def mobilenet_v3_large(num_classes: int = 1000,
                       reduced_tail: bool = False) -> MobileNetV3:
    """
    Constructs a large MobileNetV3 architecture from
    "Searching for MobileNetV3" <https://arxiv.org/abs/1905.02244>.

    weights_link:
    https://download.pytorch.org/models/mobilenet_v3_large-8738ca79.pth

    Args:
        num_classes (int): number of classes
        reduced_tail (bool): If True, reduces the channel counts of all feature layers
            between C4 and C5 by 2. It is used to reduce the channel redundancy in the
            backbone for Detection and Segmentation.
    """
    width_multi = 1.0
    bneck_conf = partial(InvertedResidualConfig, width_multi=width_multi)
    adjust_channels = partial(InvertedResidualConfig.adjust_channels, width_multi=width_multi)

    reduce_divider = 2 if reduced_tail else 1

    inverted_residual_setting = [
        # input_c, kernel, expanded_c, out_c, use_se, activation, stride
        bneck_conf(16, 3, 16, 16, False, "RE", 1),
        bneck_conf(16, 3, 64, 24, False, "RE", 2),  # C1
        bneck_conf(24, 3, 72, 24, False, "RE", 1),
        bneck_conf(24, 5, 72, 40, True, "RE", 2),  # C2
        bneck_conf(40, 5, 120, 40, True, "RE", 1),
        bneck_conf(40, 5, 120, 40, True, "RE", 1),
        bneck_conf(40, 3, 240, 80, False, "HS", 2),  # C3
        bneck_conf(80, 3, 200, 80, False, "HS", 1),
        bneck_conf(80, 3, 184, 80, False, "HS", 1),
        bneck_conf(80, 3, 184, 80, False, "HS", 1),
        bneck_conf(80, 3, 480, 112, True, "HS", 1),
        bneck_conf(112, 3, 672, 112, True, "HS", 1),
        bneck_conf(112, 5, 672, 160 // reduce_divider, True, "HS", 2),  # C4
        bneck_conf(160 // reduce_divider, 5, 960 // reduce_divider, 160 // reduce_divider, True, "HS", 1),
        bneck_conf(160 // reduce_divider, 5, 960 // reduce_divider, 160 // reduce_divider, True, "HS", 1),
    ]
    last_channel = adjust_channels(1280 // reduce_divider)  # C5

    return MobileNetV3(inverted_residual_setting=inverted_residual_setting,
                       last_channel=last_channel,
                       num_classes=num_classes)


def mobilenet_v3_small(num_classes: int = 1000,
                       reduced_tail: bool = False) -> MobileNetV3:
    """
    Constructs a small MobileNetV3 architecture from
    "Searching for MobileNetV3" <https://arxiv.org/abs/1905.02244>.

    weights_link:
    https://download.pytorch.org/models/mobilenet_v3_small-047dcff4.pth

    Args:
        num_classes (int): number of classes
        reduced_tail (bool): If True, reduces the channel counts of all feature layers
            between C4 and C5 by 2. It is used to reduce the channel redundancy in the
            backbone for Detection and Segmentation.
    """
    width_multi = 1.0
    bneck_conf = partial(InvertedResidualConfig, width_multi=width_multi)
    adjust_channels = partial(InvertedResidualConfig.adjust_channels, width_multi=width_multi)

    reduce_divider = 2 if reduced_tail else 1

    inverted_residual_setting = [
        # input_c, kernel, expanded_c, out_c, use_se, activation, stride
        bneck_conf(16, 3, 16, 16, True, "RE", 2),  # C1
        bneck_conf(16, 3, 72, 24, False, "RE", 2),  # C2
        bneck_conf(24, 3, 88, 24, False, "RE", 1),
        bneck_conf(24, 5, 96, 40, True, "HS", 2),  # C3
        bneck_conf(40, 5, 240, 40, True, "HS", 1),
        bneck_conf(40, 5, 240, 40, True, "HS", 1),
        bneck_conf(40, 5, 120, 48, True, "HS", 1),
        bneck_conf(48, 5, 144, 48, True, "HS", 1),
        bneck_conf(48, 5, 288, 96 // reduce_divider, True, "HS", 2),  # C4
        bneck_conf(96 // reduce_divider, 5, 576 // reduce_divider, 96 // reduce_divider, True, "HS", 1),
        bneck_conf(96 // reduce_divider, 5, 576 // reduce_divider, 96 // reduce_divider, True, "HS", 1)
    ]
    last_channel = adjust_channels(1024 // reduce_divider)  # C5

    return MobileNetV3(inverted_residual_setting=inverted_residual_setting,
                       last_channel=last_channel,
                       num_classes=num_classes)
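
As a quick smoke test (a minimal sketch; the weight URLs in the docstrings are the official torchvision checkpoints, and loading them is optional):

if __name__ == "__main__":
    model = mobilenet_v3_large(num_classes=1000)
    model.eval()
    x = torch.randn(1, 3, 224, 224)  # dummy ImageNet-sized input
    with torch.no_grad():
        y = model(x)
    print(y.shape)  # expected: torch.Size([1, 1000])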
