20240127 Configuring whisper on Ubuntu 20.04.6

20240131 Configuring whisper on Ubuntu 20.04.6
2024/1/31 15:48


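1. First, you need an NVIDIA graphics card. Mine is a second-hand GTX 1080 bought on Pinduoduo (PDD) for ¥800. [It is very likely an ex-mining card!]
2. Install the latest NVIDIA driver and CUDA correctly. (Optional install!)
3. Configure whisper.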
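
Step 2 can be sanity-checked from Python before going any further. The helper below is a sketch under assumptions, not part of the original notes: `pick_model` and `cuda_status` are hypothetical names, and the VRAM figures are the approximate requirements listed in the openai-whisper README (tiny and base about 1 GB, small about 2 GB, medium about 5 GB, large about 10 GB). With the roughly 8 GB of a GTX 1080 it suggests `medium`, which is the model used in the commands below.

```python
import importlib.util

# Approximate VRAM needs in GB, as listed in the openai-whisper README.
# Ordered from largest to smallest so the first fit is the biggest usable model.
APPROX_VRAM_GB = [("large", 10), ("medium", 5), ("small", 2), ("base", 1), ("tiny", 1)]


def pick_model(vram_gb):
    """Return the largest model whose approximate VRAM need fits, or None."""
    for name, need in APPROX_VRAM_GB:
        if vram_gb >= need:
            return name
    return None


def cuda_status():
    """Describe whether PyTorch can see a CUDA GPU, without failing if torch is absent."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    if not torch.cuda.is_available():
        return "torch is installed, but no CUDA device is visible"
    props = torch.cuda.get_device_properties(0)
    vram_gb = props.total_memory / 1024**3
    return f"{props.name}: {vram_gb:.1f} GB, suggested model: {pick_model(vram_gb)}"


if __name__ == "__main__":
    print(cuda_status())
```

If no CUDA device is visible, whisper can still run on the CPU, only much more slowly.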

rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ python -m pip install --upgrade pip
[Installing conda is optional.]
rootroot@rootroot-X99-Turbo:~$ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
rootroot@rootroot-X99-Turbo:~$ ffmpeg
rootroot@rootroot-X99-Turbo:~$ pip install -U openai-whisper
rootroot@rootroot-X99-Turbo:~$ pip install tiktoken
rootroot@rootroot-X99-Turbo:~$ pip install setuptools-rust
rootroot@rootroot-X99-Turbo:~$ whisper audio.mp3 --model medium --language Chinese
rootroot@rootroot-X99-Turbo:~$ whisper chi.mp4 --model medium --language Chinese
rootroot@rootroot-X99-Turbo:~$ sudo apt-get install ffmpeg
rootroot@rootroot-X99-Turbo:~$ time(whisper chs.mp4 --model medium --language Chinese)
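
The summary above can also be driven from a script. This is a minimal sketch, assuming the `whisper` command installed by `pip install -U openai-whisper` is on PATH. `whisper_command` and `timed_run` are hypothetical helper names, and only the flags already used above (`--model`, `--language`) are passed. `timed_run` plays the role of the shell's `time(...)`.

```python
import subprocess
import sys
import time


def whisper_command(media_path, model="medium", language="Chinese"):
    """Build the same command line that is used in these notes."""
    return ["whisper", media_path, "--model", model, "--language", language]


def timed_run(command):
    """Run a command and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(command)
    return completed.returncode, time.perf_counter() - start


if __name__ == "__main__":
    # Harmless self-check that does not need whisper: time the interpreter itself.
    print(timed_run([sys.executable, "-c", "pass"]))
    # Real use (assumes chs.mp4 exists and whisper is on PATH):
    # print(timed_run(whisper_command("chs.mp4")))
```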

rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ python -m pip install --upgrade pip
Collecting pip
  Downloading pip-23.3.2-py3-none-any.whl (2.1 MB)
     |████████████████████████████████| 2.1 MB 690 kB/s 
Installing collected packages: pip
Successfully installed pip-23.3.2
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ sudo mkdir /opt/tools
rootroot@rootroot-X99-Turbo:~$ cd /opt/tools/
rootroot@rootroot-X99-Turbo:/opt/tools$ 
rootroot@rootroot-X99-Turbo:/opt/tools$ ll
total 8
drwxr-xr-x 2 root root 4096 1月  26 12:21 ./
drwxr-xr-x 4 root root 4096 1月  26 12:21 ../
rootroot@rootroot-X99-Turbo:/opt/tools$ 
rootroot@rootroot-X99-Turbo:/opt/tools$ cd ~
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
--2024-01-26 12:22:28--  https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
Resolving repo.anaconda.com (repo.anaconda.com)... 104.16.130.3, 104.16.131.3, 2606:4700::6810:8203, ...
Connecting to repo.anaconda.com (repo.anaconda.com)|104.16.130.3|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 141613749 (135M) [application/octet-stream]
Saving to: ‘Miniconda3-latest-Linux-x86_64.sh’

Miniconda3-latest-Linux-x86_64.sh            100%[=============================================================================================>] 135.05M  2.82MB/s    in 51s     

2024-01-26 12:23:20 (2.65 MB/s) - ‘Miniconda3-latest-Linux-x86_64.sh’ saved [141613749/141613749]
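
Before running a downloaded installer it is worth confirming that the file arrived intact. This is a small sketch, assuming the file sits in the current directory. `file_size_and_sha256` is a hypothetical helper, and the expected size 141613749 is taken from the wget log above. No digest is recorded in these notes, so the SHA-256 value has to be compared by hand against the one Anaconda publishes for the installer.

```python
import hashlib
import os


def file_size_and_sha256(path, chunk_size=1 << 20):
    """Return (size_in_bytes, sha256_hex) for a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()


EXPECTED_SIZE = 141613749  # from the "Length:" line of the wget log above

if __name__ == "__main__":
    installer = "Miniconda3-latest-Linux-x86_64.sh"
    if os.path.exists(installer):
        size, sha256 = file_size_and_sha256(installer)
        print("size ok" if size == EXPECTED_SIZE else f"unexpected size: {size}")
        print("sha256:", sha256)
    else:
        print(f"{installer} not found in the current directory")
```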

rootroot@rootroot-X99-Turbo:~$ ffmpeg
ffmpeg version 4.2.7-0ubuntu0.1 Copyright (c) 2000-2022 the FFmpeg developers
  built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
  configuration: --prefix=/usr --extra-version=0ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
Hyper fast Audio and Video encoder
usage: ffmpeg [options] [[infile options] -i infile]... {[outfile options] outfile}...

Use -h to get full help or, even better, run 'man ffmpeg'
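
whisper calls the `ffmpeg` binary to decode audio, so the bare `ffmpeg` call above is effectively a presence check. A scripted version might look like the following sketch. `parse_ffmpeg_version` and `ffmpeg_version` are hypothetical helpers, and the parsing assumes the banner format shown above.

```python
import re
import shutil
import subprocess


def parse_ffmpeg_version(banner):
    """Extract the version token from an 'ffmpeg version ...' banner, or None."""
    match = re.search(r"^ffmpeg version (\S+)", banner, flags=re.MULTILINE)
    return match.group(1) if match else None


def ffmpeg_version():
    """Return the installed ffmpeg version string, or None if ffmpeg is not on PATH."""
    if shutil.which("ffmpeg") is None:
        return None
    result = subprocess.run(["ffmpeg", "-version"], capture_output=True, text=True)
    return parse_ffmpeg_version(result.stdout)


if __name__ == "__main__":
    print(ffmpeg_version() or "ffmpeg not found; on Ubuntu: sudo apt-get install ffmpeg")
```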
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ pip install -U openai-whisper
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: openai-whisper in ./.local/lib/python3.8/site-packages (20231117)
Requirement already satisfied: triton<3,>=2.0.0 in ./.local/lib/python3.8/site-packages (from openai-whisper) (2.2.0)
Requirement already satisfied: numba in ./.local/lib/python3.8/site-packages (from openai-whisper) (0.58.1)
Requirement already satisfied: numpy in ./.local/lib/python3.8/site-packages (from openai-whisper) (1.24.4)
Requirement already satisfied: torch in ./.local/lib/python3.8/site-packages (from openai-whisper) (2.1.2)
Requirement already satisfied: tqdm in ./.local/lib/python3.8/site-packages (from openai-whisper) (4.66.1)
Requirement already satisfied: more-itertools in ./.local/lib/python3.8/site-packages (from openai-whisper) (10.2.0)
Requirement already satisfied: tiktoken in ./.local/lib/python3.8/site-packages (from openai-whisper) (0.5.2)
Requirement already satisfied: filelock in ./.local/lib/python3.8/site-packages (from triton<3,>=2.0.0->openai-whisper) (3.13.1)
Requirement already satisfied: llvmlite<0.42,>=0.41.0dev0 in ./.local/lib/python3.8/site-packages (from numba->openai-whisper) (0.41.1)
Requirement already satisfied: importlib-metadata in ./.local/lib/python3.8/site-packages (from numba->openai-whisper) (7.0.1)
Requirement already satisfied: regex>=2022.1.18 in ./.local/lib/python3.8/site-packages (from tiktoken->openai-whisper) (2023.12.25)
Requirement already satisfied: requests>=2.26.0 in ./.local/lib/python3.8/site-packages (from tiktoken->openai-whisper) (2.31.0)
Requirement already satisfied: typing-extensions in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (4.9.0)
Requirement already satisfied: sympy in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (1.12)
Requirement already satisfied: networkx in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (3.1)
Requirement already satisfied: jinja2 in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (3.1.3)
Requirement already satisfied: fsspec in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (2023.12.2)
Requirement already satisfied: nvidia-cuda-nvrtc-cu12==12.1.105 in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (12.1.105)
Requirement already satisfied: nvidia-cuda-runtime-cu12==12.1.105 in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (12.1.105)
Requirement already satisfied: nvidia-cuda-cupti-cu12==12.1.105 in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (12.1.105)
Requirement already satisfied: nvidia-cudnn-cu12==8.9.2.26 in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (8.9.2.26)
Requirement already satisfied: nvidia-cublas-cu12==12.1.3.1 in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (12.1.3.1)
Requirement already satisfied: nvidia-cufft-cu12==11.0.2.54 in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (11.0.2.54)
Requirement already satisfied: nvidia-curand-cu12==10.3.2.106 in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (10.3.2.106)
Requirement already satisfied: nvidia-cusolver-cu12==11.4.5.107 in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (11.4.5.107)
Requirement already satisfied: nvidia-cusparse-cu12==12.1.0.106 in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (12.1.0.106)
Requirement already satisfied: nvidia-nccl-cu12==2.18.1 in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (2.18.1)
Requirement already satisfied: nvidia-nvtx-cu12==12.1.105 in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (12.1.105)
Collecting triton<3,>=2.0.0 (from openai-whisper)
  Downloading triton-2.1.0-0-cp38-cp38-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (1.3 kB)
Requirement already satisfied: nvidia-nvjitlink-cu12 in ./.local/lib/python3.8/site-packages (from nvidia-cusolver-cu12==11.4.5.107->torch->openai-whisper) (12.3.101)
Requirement already satisfied: charset-normalizer<4,>=2 in ./.local/lib/python3.8/site-packages (from requests>=2.26.0->tiktoken->openai-whisper) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/lib/python3/dist-packages (from requests>=2.26.0->tiktoken->openai-whisper) (2.8)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/lib/python3/dist-packages (from requests>=2.26.0->tiktoken->openai-whisper) (1.25.8)
Requirement already satisfied: certifi>=2017.4.17 in /usr/lib/python3/dist-packages (from requests>=2.26.0->tiktoken->openai-whisper) (2019.11.28)
Requirement already satisfied: zipp>=0.5 in ./.local/lib/python3.8/site-packages (from importlib-metadata->numba->openai-whisper) (3.17.0)
Requirement already satisfied: MarkupSafe>=2.0 in ./.local/lib/python3.8/site-packages (from jinja2->torch->openai-whisper) (2.1.3)
Requirement already satisfied: mpmath>=0.19 in ./.local/lib/python3.8/site-packages (from sympy->torch->openai-whisper) (1.3.0)
Downloading triton-2.1.0-0-cp38-cp38-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (89.2 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 89.2/89.2 MB 25.9 MB/s eta 0:00:00
Installing collected packages: triton
  Attempting uninstall: triton
    Found existing installation: triton 2.2.0
    Uninstalling triton-2.2.0:
      Successfully uninstalled triton-2.2.0
Successfully installed triton-2.1.0
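
Note that this run replaced triton 2.2.0 with 2.1.0, so the installed versions are worth recording. This is a minimal sketch using the standard library's `importlib.metadata`, which is available on the Python 3.8 shown in the paths above. `installed_versions` is a hypothetical helper, and the package list is only the set of names that appear in the pip output.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    packages = ["openai-whisper", "torch", "triton", "tiktoken", "numba", "numpy"]
    for name, version in installed_versions(packages).items():
        print(f"{name:15} {version or 'not installed'}")
```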
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ pip install tiktoken
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: tiktoken in ./.local/lib/python3.8/site-packages (0.5.2)
Requirement already satisfied: regex>=2022.1.18 in ./.local/lib/python3.8/site-packages (from tiktoken) (2023.12.25)
Requirement already satisfied: requests>=2.26.0 in ./.local/lib/python3.8/site-packages (from tiktoken) (2.31.0)
Requirement already satisfied: charset-normalizer<4,>=2 in ./.local/lib/python3.8/site-packages (from requests>=2.26.0->tiktoken) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/lib/python3/dist-packages (from requests>=2.26.0->tiktoken) (2.8)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/lib/python3/dist-packages (from requests>=2.26.0->tiktoken) (1.25.8)
Requirement already satisfied: certifi>=2017.4.17 in /usr/lib/python3/dist-packages (from requests>=2.26.0->tiktoken) (2019.11.28)
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ pip install setuptools-rust
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: setuptools-rust in ./.local/lib/python3.8/site-packages (1.8.1)
Requirement already satisfied: setuptools>=62.4 in ./.local/lib/python3.8/site-packages (from setuptools-rust) (69.0.3)
Requirement already satisfied: semantic-version<3,>=2.8.2 in ./.local/lib/python3.8/site-packages (from setuptools-rust) (2.10.0)
Requirement already satisfied: tomli>=1.2.1 in ./.local/lib/python3.8/site-packages (from setuptools-rust) (2.0.1)
rootroot@rootroot-X99-Turbo:~$ sudo apt update && sudo apt install ffmpeg
Get:1 file:/var/cuda-repo-ubuntu2004-12-0-local  InRelease [1,575 B]
Get:2 file:/var/cuda-repo-ubuntu2004-12-3-local  InRelease [1,572 B]
Get:1 file:/var/cuda-repo-ubuntu2004-12-0-local  InRelease [1,575 B]                                                  
Get:2 file:/var/cuda-repo-ubuntu2004-12-3-local  InRelease [1,572 B]                                                  
Hit:3 http://mirrors.tuna.tsinghua.edu.cn/ubuntu focal InRelease                                                                       
Hit:4 http://mirrors.tuna.tsinghua.edu.cn/ubuntu focal-updates InRelease                         
Hit:5 http://mirrors.tuna.tsinghua.edu.cn/ubuntu focal-backports InRelease                       
Hit:6 http://security.ubuntu.com/ubuntu focal-security InRelease               
Hit:7 http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu focal InRelease     
Reading package lists... Done
Building dependency tree       
Reading state information... Done
30 packages can be upgraded. Run 'apt list --upgradable' to see them.
Reading package lists... Done
Building dependency tree       
Reading state information... Done
ffmpeg is already the newest version (7:4.2.7-0ubuntu0.1).
0 upgraded, 0 newly installed, 0 to remove and 30 not upgraded.
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ whisper audio.mp3 --model medium --language Chinese
100%|█████████████████████████████████████| 1.42G/1.42G [03:24<00:00, 7.48MiB/s]
Traceback (most recent call last):
  File "/home/rootroot/.local/lib/python3.8/site-packages/whisper/audio.py", line 58, in load_audio
    out = run(cmd, capture_output=True, check=True).stdout
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ffmpeg', '-nostdin', '-threads', '0', '-i', 'audio.mp3', '-f', 's16le', '-ac', '1', '-acodec', 'pcm_s16le', '-ar', '16000', '-']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/rootroot/.local/lib/python3.8/site-packages/whisper/transcribe.py", line 478, in cli
    result = transcribe(model, audio_path, temperature=temperature, **args)
  File "/home/rootroot/.local/lib/python3.8/site-packages/whisper/transcribe.py", line 122, in transcribe
    mel = log_mel_spectrogram(audio, model.dims.n_mels, padding=N_SAMPLES)
  File "/home/rootroot/.local/lib/python3.8/site-packages/whisper/audio.py", line 140, in log_mel_spectrogram
    audio = load_audio(audio)
  File "/home/rootroot/.local/lib/python3.8/site-packages/whisper/audio.py", line 60, in load_audio
    raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from e
RuntimeError: Failed to load audio: ffmpeg version 4.2.7-0ubuntu0.1 Copyright (c) 2000-2022 the FFmpeg developers
  built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
  configuration: --prefix=/usr --extra-version=0ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
audio.mp3: No such file or directory

Skipping audio.mp3 due to RuntimeError: Failed to load audio: ffmpeg version 4.2.7-0ubuntu0.1 Copyright (c) 2000-2022 the FFmpeg developers
  built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
  configuration: --prefix=/usr --extra-version=0ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
audio.mp3: No such file or directory

rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ whisper chi.mp4 --model medium --language Chinese
Traceback (most recent call last):
  File "/home/rootroot/.local/lib/python3.8/site-packages/whisper/audio.py", line 58, in load_audio
    out = run(cmd, capture_output=True, check=True).stdout
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ffmpeg', '-nostdin', '-threads', '0', '-i', 'chi.mp4', '-f', 's16le', '-ac', '1', '-acodec', 'pcm_s16le', '-ar', '16000', '-']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/rootroot/.local/lib/python3.8/site-packages/whisper/transcribe.py", line 478, in cli
    result = transcribe(model, audio_path, temperature=temperature, **args)
  File "/home/rootroot/.local/lib/python3.8/site-packages/whisper/transcribe.py", line 122, in transcribe
    mel = log_mel_spectrogram(audio, model.dims.n_mels, padding=N_SAMPLES)
  File "/home/rootroot/.local/lib/python3.8/site-packages/whisper/audio.py", line 140, in log_mel_spectrogram
    audio = load_audio(audio)
  File "/home/rootroot/.local/lib/python3.8/site-packages/whisper/audio.py", line 60, in load_audio
    raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from e
RuntimeError: Failed to load audio: ffmpeg version 4.2.7-0ubuntu0.1 Copyright (c) 2000-2022 the FFmpeg developers
  built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
  configuration: --prefix=/usr --extra-version=0ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
chi.mp4: No such file or directory

Skipping chi.mp4 due to RuntimeError: Failed to load audio: ffmpeg version 4.2.7-0ubuntu0.1 Copyright (c) 2000-2022 the FFmpeg developers
  built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
  configuration: --prefix=/usr --extra-version=0ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
chi.mp4: No such file or directory
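
Both failures above end with `No such file or directory`. ffmpeg itself is installed (its banner is printed), but `audio.mp3` and `chi.mp4` do not exist in the home directory. A pre-flight check along these lines would surface that before the 1.42 GB model is loaded. This is a sketch only: `preflight` is a hypothetical helper, not part of whisper.

```python
import os
import shutil


def preflight(media_path):
    """Return a list of human-readable problems; an empty list means ready to run."""
    problems = []
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg is not on PATH")
    if not os.path.isfile(media_path):
        problems.append(f"input file not found: {media_path}")
    elif not os.access(media_path, os.R_OK):
        problems.append(f"input file is not readable: {media_path}")
    return problems


if __name__ == "__main__":
    for candidate in ["audio.mp3", "chi.mp4", "chs.mp4"]:
        issues = preflight(candidate)
        print(candidate, "->", "ok" if not issues else "; ".join(issues))
```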

rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ sudo apt-get install ffmpeg
Reading package lists... Done
Building dependency tree       
Reading state information... Done
ffmpeg is already the newest version (7:4.2.7-0ubuntu0.1).
0 upgraded, 0 newly installed, 0 to remove and 30 not upgraded.
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ ll *.mp4
-rwx------ 1 rootroot rootroot 3465644 1月  12 01:28 chs.mp4*
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ whisper chs.mp4 --model medium --language Chinese
[00:00.000 --> 00:01.400] 前段時間有個巨石鴻吼
[00:01.400 --> 00:03.000] 某某是男人最好的衣妹
[00:03.000 --> 00:04.800] 這裡的某某可以替換為減肥
[00:04.800 --> 00:07.800] 長髮 西裝 考研 術唱 永潔無間等等等等
[00:07.800 --> 00:09.200] 我聽到最新的一個說法是
[00:09.200 --> 00:12.000] 微分碎蓋加口罩加半框眼鏡加春風衣
[00:12.000 --> 00:13.400] 等於男人最好的衣妹
[00:13.400 --> 00:14.400] 大概也就前幾年
[00:14.400 --> 00:17.400] 春風衣還和格子襯衫並列為程序員穿搭精華
[00:17.400 --> 00:20.000] 紫紅色春風衣還被譽為廣場舞大媽標配
[00:20.000 --> 00:21.600] 路透牌還是我爹這個年紀的人
[00:21.600 --> 00:22.800] 才會願意買的牌子
[00:22.800 --> 00:24.400] 不知道風向為啥變得這麼快
[00:24.400 --> 00:26.800] 為啥這東西突然變成男生逆襲神器
[00:26.800 --> 00:27.800] 時尚潮流單品
[00:27.800 --> 00:29.400] 後來我翻了一下小紅書就懂了
[00:29.400 --> 00:30.400] 時尚這個時期
[00:30.400 --> 00:31.600] 重點不在於衣服
[00:31.600 --> 00:32.200] 在於人
[00:32.200 --> 00:34.600] 先在小紅書上面和春風衣相關的筆記
[00:34.600 --> 00:36.200] 照片裡的男生都是這樣的
[00:36.200 --> 00:37.000] 這樣的
[00:37.000 --> 00:38.000] 還有這樣的
[00:38.000 --> 00:39.400] 你們哪裡是看穿搭的
[00:39.400 --> 00:40.600] 你們明明是看臉
[00:40.600 --> 00:41.800] 就這個造型 這個年齡
[00:41.800 --> 00:44.000] 你換上老頭衫也能穿出氛圍感好嗎
[00:44.000 --> 00:46.600] 我又想起了當年郭德綱老師穿季凡西的殘劇
[00:46.600 --> 00:48.600] 這個世界對我們這些長得不好看的人
[00:48.600 --> 00:49.600] 還真是苛刻的
[00:49.600 --> 00:52.000] 所以說我總結了一下春風衣傳達的要領
[00:52.200 --> 00:54.400] 大概就是一張白鏡且人畜無憾的臉
[00:54.400 --> 00:55.200] 充足的髮量
[00:55.200 --> 00:56.200] 纖細的體型
[00:56.200 --> 00:58.200] 當然身上的春風衣還得是駱駝的
[00:58.200 --> 00:59.400] 去年在戶外用品界
[00:59.400 --> 01:00.200] 最頂流的
[01:00.200 --> 01:01.200] 既不是鳥橡樹
[01:01.200 --> 01:02.800] 也不是有校服之稱的北面
[01:02.800 --> 01:04.200] 或者老臺頂流哥倫比亞
[01:04.200 --> 01:05.000] 而是駱駝
[01:05.000 --> 01:07.200] 雙11 駱駝在天貓戶外服飾品類
[01:07.200 --> 01:09.000] 拿下銷售額和銷量雙料冠軍
[01:09.000 --> 01:10.200] 銷量達到百萬幾
[01:10.200 --> 01:10.800] 再抖音
[01:10.800 --> 01:13.400] 駱駝銷售同比增幅高達296%
[01:13.400 --> 01:16.200] 旗下主打的三合一高性價比春風衣成為爆品
[01:22.600 --> 01:23.200] 至於線下
[01:23.200 --> 01:24.400] 還是網友總覺得好
[01:24.400 --> 01:26.800] 如今在南方街頭的駱駝比沙漠裡的都多
[01:30.000 --> 01:31.200] 至於駱駝為啥這麼火
[01:31.200 --> 01:32.000] 便宜啊
[01:32.000 --> 01:33.600] 拿賣得最好的丁珍同款
[01:33.600 --> 01:35.600] 幻影黑三合一春風衣舉個例子
[01:35.600 --> 01:36.000] 線下買
[01:36.000 --> 01:37.600] 標牌價格2198
[01:37.600 --> 01:39.200] 但是跑到網上看一下
[01:39.200 --> 01:40.800] 標價就變成了699
[01:40.800 --> 01:41.400] 至於折扣
[01:41.400 --> 01:42.400] 日常也都是有的
[01:42.400 --> 01:43.600] 400出頭就能買到
[01:43.600 --> 01:45.200] 甚至有時候能递到300價
[01:45.200 --> 01:46.200] 要是你還嫌貴
[01:46.200 --> 01:48.400] 駱駝還有200塊出頭的單層春風衣
[01:48.400 --> 01:49.200] 就這個價格
[01:49.200 --> 01:51.800] 哥上海恐怕還不夠兩次City Walk的報名費
[01:51.800 --> 01:52.600] 看來這個價格
[01:52.600 --> 01:54.800] 再對比一下北面1000塊錢起步
[01:54.800 --> 01:56.000] 你就能理解為啥北面
[01:56.000 --> 01:58.200] 這麼快就被大學生踢出了校服序列了
[01:58.200 --> 02:00.400] 我不知道現在大學生每個月生活費多少
[02:00.400 --> 02:02.200] 反正按照我上學時候的生活費
[02:02.200 --> 02:03.200] 一個月不吃不喝
[02:03.200 --> 02:05.000] 也就買得起倆袖子加一個帽子
[02:05.000 --> 02:06.400] 難怪當年全是假北面
[02:06.400 --> 02:07.400] 現在都是真駱駝
[02:07.400 --> 02:08.800] 至少人家是正品啊
[02:08.800 --> 02:10.000] 我翻了一下社交媒體
[02:10.000 --> 02:11.200] 發現對駱駝的吐槽
[02:11.200 --> 02:12.000] 和買了駱駝的
[02:12.000 --> 02:13.400] 基本上是1比1的比例
[02:13.400 --> 02:15.000] 吐槽最多的就是衣服會掉色
[02:15.000 --> 02:15.800] 還會串色
[02:15.800 --> 02:17.000] 比如圖層洗個幾次
[02:17.000 --> 02:18.200] 穿個兩天就掉光了
[02:18.200 --> 02:19.600] 比如不同倉庫發的貨
[02:19.600 --> 02:20.600] 質量參差不齊
[02:20.600 --> 02:21.600] 買衣服還得看戶口
[02:21.600 --> 02:22.400] 聽出聲
[02:22.400 --> 02:23.600] 至於什麼做工比較差
[02:23.600 --> 02:24.800] 內膽多 走線操
[02:24.800 --> 02:26.400] 不防水之類的就更多了
[02:26.400 --> 02:27.400] 但是這些吐槽
[02:27.400 --> 02:29.200] 並不意味著會影響駱駝的銷量
[02:29.200 --> 02:30.800] 甚至還會有不少自來水表示
[02:30.800 --> 02:32.600] 就這價格要啥子行車啊
[02:32.600 --> 02:34.000] 所謂性價比性價比
[02:34.000 --> 02:35.200] 脫離價位談性能
[02:35.200 --> 02:37.000] 這就不符合消費者的需求嘛
[02:37.000 --> 02:38.400] 無數次價格戰告訴我們
[02:38.400 --> 02:39.400] 只要肯降價
[02:39.400 --> 02:41.000] 就沒有賣不出去的產品
[02:41.000 --> 02:42.400] 一件衝鋒衣1000多
[02:42.400 --> 02:43.600] 你覺得平平無奇
[02:43.600 --> 02:45.000] 500多你覺得差點意思
[02:45.000 --> 02:46.400] 200塊你就秒下單了
[02:46.400 --> 02:47.000] 到99
[02:47.000 --> 02:48.400] 恐怕就要拼點手速了
[02:48.400 --> 02:49.600] 像衝鋒衣這個品類
[02:49.600 --> 02:50.800] 本來價格跨度就大
[02:50.800 --> 02:52.800] 北面最便宜的GORTEX衝鋒衣
[02:52.800 --> 02:53.800] 價格3000起步
[02:53.800 --> 02:55.200] 大概是同品牌最便宜
[02:55.200 --> 02:56.200] 衝鋒衣的三倍價格
[02:56.200 --> 02:57.200] 至於十足那樣
[02:57.200 --> 02:59.000] 搭載了GORTEX的硬殼起步價
[02:59.000 --> 03:00.000] 就要到4500
[03:00.000 --> 03:01.200] 而且同樣是GORTEX
[03:01.200 --> 03:02.800] 內部也有不同的系列和檔次
[03:02.800 --> 03:03.600] 做成衣服
[03:03.600 --> 03:05.600] 中間的差價恐怕就夠買兩件駱駝了
[03:05.600 --> 03:06.600] 至於智能控溫
[03:06.600 --> 03:07.400] 防水拉鍊
[03:07.400 --> 03:08.000] 全壓膠
[03:08.000 --> 03:09.800] 更加不可能出現在駱駝這裡了
[03:09.800 --> 03:11.800] 至少不會是300 400的駱駝身上會有的
[03:11.800 --> 03:12.800] 有的價外的衣服
[03:12.800 --> 03:14.200] 買的就是一個放棄幻想
[03:14.200 --> 03:15.800] 吃到肚子裡的科技魚很活
[03:15.800 --> 03:17.000] 是能給你省錢的
[03:17.000 --> 03:18.400] 穿在身上的科技魚很活
[03:18.400 --> 03:20.000] 裝裝件件都是要加錢的
[03:20.000 --> 03:21.600] 所以正如羅曼羅蘭所說
[03:21.600 --> 03:23.200] 這世界上只有一種英雄主義
[03:23.200 --> 03:24.800] 就是在認清了駱駝的本質以後
[03:24.800 --> 03:26.000] 依然選擇買駱駝
[03:26.000 --> 03:27.000] 關於駱駝的火爆
[03:27.000 --> 03:28.200] 我有一些小小的看法
[03:28.200 --> 03:29.000] 駱駝這個東西
[03:29.000 --> 03:30.400] 它其實就是個潮牌
[03:30.400 --> 03:32.000] 看看它的營銷方式就知道了
[03:32.000 --> 03:33.000] 現在打開小黃書
[03:33.000 --> 03:35.000] 日常可以看到駱駝穿搭是這樣的
[03:35.000 --> 03:36.600] 加一點氛圍感是這樣的
[03:36.600 --> 03:37.400] 對比一下
[03:37.400 --> 03:39.000] 其他品牌的風格是這樣的
[03:39.000 --> 03:39.800] 這樣的
[03:39.800 --> 03:41.200] 其實對比一下就知道了
[03:41.200 --> 03:42.600] 其他品牌突出一個時程
[03:42.600 --> 03:44.200] 能防風就一定要講防風
[03:44.200 --> 03:46.000] 能扛動就一定要講扛動
[03:46.000 --> 03:47.400] 但駱駝在營銷的時候
[03:47.400 --> 03:49.200] 主打的就是一個城市戶外風
[03:49.200 --> 03:50.400] 雖然造型是春風衣
[03:50.400 --> 03:52.200] 但場景往往是在城市裡
[03:52.200 --> 03:54.200] 哪怕在野外也要突出一個風和日麗
[03:54.200 --> 03:55.000] 陽光美媚
[03:55.000 --> 03:56.400] 至少不會在明顯的嚴寒
[03:56.400 --> 03:58.000] 高海拔或是惡劣氣候下
[03:58.200 --> 04:00.200] 如果用一個詞形容駱駝的營銷風格
[04:00.200 --> 04:01.000] 那就是清洗
[04:01.000 --> 04:03.000] 或者說他很理解自己的消費者是誰
[04:03.000 --> 04:04.000] 需要什麼產品
[04:04.000 --> 04:05.200] 從使用場景來說
[04:05.200 --> 04:06.600] 駱駝的消費者買春風衣
[04:06.600 --> 04:08.800] 不是真的有什麼大風大雨要去應對
[04:08.800 --> 04:11.000] 春風衣的作用是下雨沒帶傘的時候
[04:11.000 --> 04:12.000] 臨時頂個幾分鐘
[04:12.000 --> 04:13.600] 讓你能圖書館跑回宿舍
[04:13.600 --> 04:15.000] 或者是冬天騎電動車
[04:15.000 --> 04:16.200] 被風吹得不行的時候
[04:16.200 --> 04:17.200] 稍微扛一下風
[04:17.200 --> 04:18.400] 不至於體感太冷
[04:18.400 --> 04:19.800] 當然他們也會出門
[04:19.800 --> 04:21.800] 但大部分時候也都是去別的城市
[04:21.800 --> 04:24.000] 或者在城市周邊搞搞簡單的徒步
[04:24.000 --> 04:26.000] 這種情況下穿個駱駝已經夠了
[04:26.000 --> 04:27.200] 從購買動機來說
[04:27.200 --> 04:29.200] 駱駝就更沒有必要上那些應回科技了
[04:29.200 --> 04:31.000] 消費者買駱駝買的是個什麼呢
[04:31.000 --> 04:32.200] 不是春風衣的功能性
[04:32.200 --> 04:33.400] 而是春風衣的造型
[04:33.400 --> 04:34.400] 寬鬆的版型
[04:34.400 --> 04:36.400] 能精準遮住微微隆起的小肚子
[04:36.400 --> 04:37.400] 棱角分明的質感
[04:37.400 --> 04:39.400] 能隱藏一切不完美的身體線條
[04:39.400 --> 04:41.400] 顯瘦的副作用就是顯年輕
[04:41.400 --> 04:42.600] 再配上一條牛仔褲
[04:42.600 --> 04:43.800] 配上一雙大黃靴
[04:43.800 --> 04:45.200] 大學生的氣質就出來了
[04:45.200 --> 04:46.200] 要是自拍的時候
[04:46.200 --> 04:47.800] 再配上大學宿舍洗素臺
[04:47.800 --> 04:49.200] 那永遠擦不乾淨的鏡子
[04:49.200 --> 04:50.600] 瞬間青春無敵了
[04:50.800 --> 04:51.800] 說的更直白一點
[04:51.800 --> 04:53.200] 人家買的是個簡靈神器
[04:53.200 --> 04:53.800] 所以說
[04:53.800 --> 04:56.000] 吐槽穿駱駝都是假戶外愛好者的人
[04:56.000 --> 04:57.600] 其實並沒有理解駱駝的定位
[04:57.600 --> 04:59.800] 駱駝其實是給了想要入門山系穿搭
[04:59.800 --> 05:01.800] 想要追逐流行的人一個最平價
[05:01.800 --> 05:03.000] 決策成本最低的選擇
[05:03.000 --> 05:04.800] 至於那些真正的硬核戶外愛好者
[05:04.800 --> 05:05.800] 駱駝既沒有能力
[05:05.800 --> 05:07.200] 也沒有打算觸打他們
[05:07.200 --> 05:08.000] 反過來說
[05:08.000 --> 05:09.600] 那些自駕穿越邊疆國道
[05:09.600 --> 05:11.800] 或者去奧爾卑斯山區登山探險的人
[05:11.800 --> 05:13.600] 也不太可能在戶外服飾上省錢
[05:13.600 --> 05:15.000] 畢竟光是交通住宿
[05:15.400 --> 05:16.400] 成本就不低了
[05:16.400 --> 05:17.200] 對他們來說
[05:17.200 --> 05:19.000] 戶外裝備很多時候是保命用的
[05:19.000 --> 05:21.000] 也就不存在跟風奧造型的必要了
[05:21.000 --> 05:22.200] 最後我再說個題外話
[05:22.200 --> 05:24.200] 年輕人追捧駱駝一個隱藏的原因
[05:24.200 --> 05:25.800] 其實是羽絨服越來越貴了
[05:25.800 --> 05:26.600] 有媒體統計
[05:26.600 --> 05:30.000] 現在國產羽絨服的平均售價已經高達881元
[05:30.000 --> 05:32.000] 波斯登均價最高接近2000元
[05:32.000 --> 05:32.800] 而且過去幾年
[05:32.800 --> 05:34.800] 國產羽絨服品牌都在轉向高端化
[05:34.800 --> 05:37.000] 羽絨服市場分為8000元以上的奢侈級
[05:37.000 --> 05:38.400] 2000元以下的大眾級
[05:38.400 --> 05:39.800] 而在中間的高端級
[05:39.800 --> 05:41.200] 國產品牌一直沒有存在感
[05:41.200 --> 05:42.200] 所以過去幾年
[05:42.200 --> 05:43.600] 波斯登天工人這些品牌
[05:43.600 --> 05:45.200] 都把2000元到8000元這個市場
[05:45.200 --> 05:46.600] 當成未來的發展趨勢
[05:46.600 --> 05:48.000] 東新證券研報顯示
[05:48.000 --> 05:49.600] 從2018到2021年
[05:49.600 --> 05:52.200] 波斯登均價4年漲幅達到60%以上
[05:52.200 --> 05:53.200] 過去5個菜年
[05:53.200 --> 05:55.000] 這個品牌的營銷開支從20多億
[05:55.000 --> 05:56.000] 漲到了60多億
[05:56.000 --> 05:57.200] 羽絨服價格往上走
[05:57.200 --> 05:59.200] 年輕消費者就開始拋棄羽絨服
[05:59.200 --> 06:00.400] 購買平價衝鋒衣
[06:00.400 --> 06:02.200] 裡面再穿個普通價外的瑤麗絨
[06:02.200 --> 06:03.400] 或者羽絨小夾克
[06:03.400 --> 06:05.200] 也不比大幾千的羽絨服差多少
[06:05.200 --> 06:05.800] 說到底
[06:05.800 --> 06:07.000] 現在消費社會發達了
[06:07.000 --> 06:08.000] 沒有什麼需求是
[06:08.000 --> 06:09.600] 一定要某種特定的解決方案
[06:09.600 --> 06:11.600] 特定價位的商品才能實現的
[06:11.600 --> 06:12.200] 要保暖
[06:12.200 --> 06:13.200] 羽絨服固然很好
[06:13.200 --> 06:15.200] 但衝鋒衣加一些內搭也很暖和
[06:15.200 --> 06:16.000] 要時尚
[06:16.000 --> 06:18.000] 大幾千塊錢的設計師品牌非常不錯
[06:18.000 --> 06:19.400] 但350的拼多多服飾
[06:19.400 --> 06:20.600] 搭得好也能出彩
[06:20.600 --> 06:21.600] 要去野外徒步
[06:21.600 --> 06:23.000] 花五六千買鳥也可以
[06:23.000 --> 06:25.200] 但迪卡農也足以應付大多數狀況
[06:25.200 --> 06:25.800] 所以說
[06:25.800 --> 06:27.600] 花高價買衝鋒衣當然也OK
[06:27.600 --> 06:28.600] 三四百買件駱駝
[06:28.600 --> 06:29.800] 也是可以接受的選擇
[06:29.800 --> 06:32.000] 何況駱駝也多多少少有一些功能性
[06:32.000 --> 06:33.800] 畢竟它再怎麼樣還是個衝鋒衣
[06:33.800 --> 06:34.800] 理解了這個事情
[06:34.800 --> 06:36.800] 就很容易分辨什麼是智商稅的
[06:36.800 --> 06:38.800] 那些向你灌輸非某個品牌不用
[06:38.800 --> 06:39.800] 告訴你某個需求
[06:39.800 --> 06:41.400] 只有某個產品才能滿足
[06:41.400 --> 06:42.200] 某個品牌
[06:42.200 --> 06:44.400] 就是某個品牌絕對的比試鏈頂端
[06:44.400 --> 06:46.800] 這類銀銷的智商稅含量必然是很高的
[06:46.800 --> 06:48.800] 它的目的是剝奪你選擇的權利
[06:48.800 --> 06:51.200] 讓你主動放棄比價和尋找平梯的想法
[06:51.200 --> 06:53.000] 從而避免與其他品牌競爭
[06:53.000 --> 06:54.200] 而沒有競爭的市場
[06:54.200 --> 06:56.200] 才是智商稅含量最高的市場
[06:56.200 --> 06:57.400] 消費商業洞穴
[06:57.400 --> 06:58.400] 禁在IC實驗室
[06:58.400 --> 06:59.000] 我是館長
[06:59.000 --> 07:00.000] 我們下期再見
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ 
rootroot@rootroot-X99-Turbo:~$ time(whisper chs.mp4 --model medium --language Chinese)
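
The console output above prints one segment per line in the form [start --> end] text. If only this console log was kept, it can be turned into subtitles afterwards. The following is a minimal sketch under that assumption; the names LINE_RE, to_seconds, srt_time and log_to_srt are hypothetical helpers written for illustration and are not part of whisper.

```python
import re

# Matches lines such as "[06:58.400 --> 06:59.000] text"
LINE_RE = re.compile(r"^\[([\d:.]+) --> ([\d:.]+)\]\s*(.*)$")


def to_seconds(stamp):
    """Convert 'MM:SS.mmm' or 'HH:MM:SS.mmm' to seconds."""
    seconds = 0.0
    for part in stamp.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def srt_time(seconds):
    """Format seconds as the SRT timestamp 'HH:MM:SS,mmm'."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def log_to_srt(log_text):
    """Build SRT text from the timestamped lines found in a console log."""
    blocks = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip prompts, warnings and other non-transcript lines
        start, end, text = match.groups()
        index = len(blocks) + 1
        blocks.append(
            f"{index}\n{srt_time(to_seconds(start))} --> "
            f"{srt_time(to_seconds(end))}\n{text}\n"
        )
    return "\n".join(blocks)


sample = """[06:58.400 --> 06:59.000] 我是館長
[06:59.000 --> 07:00.000] 我們下期再見
rootroot@rootroot-X99-Turbo:~$ """
srt = log_to_srt(sample)
print(srt)
```

Lines that do not match the timestamp pattern (shell prompts, download progress, warnings) are simply ignored, so a whole terminal session can be pasted in.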

https://www.toutiao.com/article/7189209812264075835/?app=news_article&timestamp=1706203570&use_new_style=1&req_id=20240126012609901ACEF7F5666533AA21&group_id=7189209812264075835&tt_from=mobile_qq&utm_source=mobile_qq&utm_medium=toutiao_android&utm_campaign=client_share&share_token=5e0cda89-00c5-40fe-afa0-c3c88dd056c4&source=m_redirect
已达到人类水准语音识别模型的whisper,真的有这么厉害吗?

The language parameter of the transcribe function currently supports 99 languages, listed below:

"en": "english","zh": "chinese",
"de": "german","es": "spanish",
"ru": "russian","ko": "korean",
"fr": "french","ja": "japanese",
"pt": "portuguese","tr": "turkish",
"pl": "polish","ca": "catalan",
"nl": "dutch","ar": "arabic",
"sv": "swedish","it": "italian",
"id": "indonesian","hi": "hindi",
"fi": "finnish","vi": "vietnamese",
"he": "hebrew","uk": "ukrainian",
"el": "greek","ms": "malay",
"cs": "czech","ro": "romanian",
"da": "danish","hu": "hungarian",
"ta": "tamil","no": "norwegian",
"th": "thai","ur": "urdu",
"hr": "croatian","bg": "bulgarian",
"lt": "lithuanian","la": "latin",
"mi": "maori","ml": "malayalam",
"cy": "welsh","sk": "slovak",
"te": "telugu","fa": "persian",
"lv": "latvian","bn": "bengali",
"sr": "serbian","az": "azerbaijani",
"sl": "slovenian","kn": "kannada",
"et": "estonian","mk": "macedonian",
"br": "breton","eu": "basque",
"is": "icelandic","hy": "armenian",
"ne": "nepali","mn": "mongolian",
"bs": "bosnian","kk": "kazakh",
"sq": "albanian","sw": "swahili",
"gl": "galician","mr": "marathi",
"pa": "punjabi","si": "sinhala",
"km": "khmer","sn": "shona",
"yo": "yoruba","so": "somali",
"af": "afrikaans","oc": "occitan",
"ka": "georgian","be": "belarusian",
"tg": "tajik","sd": "sindhi",
"gu": "gujarati","am": "amharic",
"yi": "yiddish","lo": "lao",
"uz": "uzbek","fo": "faroese",
"ht": "haitian creole","ps": "pashto",
"tk": "turkmen","nn": "nynorsk",
"mt": "maltese","sa": "sanskrit",
"lb": "luxembourgish","my": "myanmar",
"bo": "tibetan","tl": "tagalog",
"mg": "malagasy","as": "assamese",
"tt": "tatar","haw": "hawaiian",
"ln": "lingala","ha": "hausa",
"ba": "bashkir","jw": "javanese","su": "sundanese",
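
The table above maps short codes to English language names, and the commands in this document pass the name (for example --language Chinese). As a rough sketch of how a name or a code could be normalized to one form before calling whisper, the helper below uses a small excerpt of the table; LANGUAGES here is only that excerpt, and normalize_language is a hypothetical name chosen for illustration.

```python
# Small excerpt of the code -> name table listed above (not the full list).
LANGUAGES = {
    "en": "english",
    "zh": "chinese",
    "de": "german",
    "ja": "japanese",
    "ht": "haitian creole",
}

# Reverse mapping: English name -> code.
NAME_TO_CODE = {name: code for code, name in LANGUAGES.items()}


def normalize_language(value):
    """Return the short code for a code or an English name, ignoring case."""
    key = value.strip().lower()
    if key in LANGUAGES:
        return key
    if key in NAME_TO_CODE:
        return NAME_TO_CODE[key]
    raise ValueError(f"Unsupported language: {value}")


print(normalize_language("Chinese"))
print(normalize_language("JA"))
```

Rejecting unknown values early gives a clearer error than letting a misspelled language name reach the model.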
The official project also provides another way to call the model:

import whisper
model = whisper.load_model("base")
# load audio and pad/trim it to fit 30 seconds
audio = whisper.load_audio("audio.mp3")
audio = whisper.pad_or_trim(audio)
# make log-Mel spectrogram and move to the same device as the model
mel = whisper.log_mel_spectrogram(audio).to(model.device)
# detect the spoken language
_, probs = model.detect_language(mel)
print(f"Detected language: {max(probs, key=probs.get)}")
# decode the audio
options = whisper.DecodingOptions(language='Chinese')
result = whisper.decode(model, mel, options)
# print the recognized text
print(result.text)
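
As the comment in the snippet above says, the audio is padded or trimmed to 30 seconds, so this call only covers the first 30 seconds of a file. For longer recordings the whisper README describes model.transcribe(), which processes the whole file with a sliding 30-second window. The sketch below only illustrates the windowing idea with fixed, non-overlapping windows; it is a simplification and not whisper's actual seeking logic. The 16 kHz sample rate is an assumption matching the audio that whisper loads, and window_bounds is a hypothetical helper.

```python
SAMPLE_RATE = 16_000   # samples per second (assumed 16 kHz input)
WINDOW_SECONDS = 30    # length of one decoding window


def window_bounds(n_samples, sample_rate=SAMPLE_RATE, window_seconds=WINDOW_SECONDS):
    """Return (start, end) sample indices of consecutive, non-overlapping windows."""
    size = sample_rate * window_seconds
    return [
        (start, min(start + size, n_samples))
        for start in range(0, n_samples, size)
    ]


# A 7-minute clip (420 s) splits into 14 windows of 30 s each.
bounds = window_bounds(420 * SAMPLE_RATE)
print(len(bounds), bounds[0], bounds[-1])
```

The last window of a real file is usually shorter than 30 seconds, which is where padding comes in.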

References:
https://www.toutiao.com/article/7229151806801248807/?app=news_article&timestamp=1706203733&use_new_style=1&req_id=20240126012853D9D3D4539BEF1333DBCC&group_id=7229151806801248807&tt_from=mobile_qq&utm_source=mobile_qq&utm_medium=toutiao_android&utm_campaign=client_share&share_token=085ce76c-b23a-4609-b2d0-d18c8d7ab8f8&source=m_redirect
C++版本人工智能实时语音转文字(字幕/语音识别)Whisper.cpp实践


[Windows; the large model needs about 10 GB]
https://blog.csdn.net/hhy321/article/details/134897967?spm=1001.2101.3001.6650.2&utm_medium=distribute.wap_relevant.none-task-blog-2~default~CTRLIST~Rate-2-134897967-blog-130001848.237%5Ev3%5Ewap_relevant_t0_download&depth_1-utm_source=distribute.wap_relevant.none-task-blog-2~default~CTRLIST~Rate-2-134897967-blog-130001848.237%5Ev3%5Ewap_relevant_t0_download&share_token=845e69c5-c625-4834-8faa-08f1f29f55b2
【小沐学Python】Python实现语音识别(Whisper)


https://blog.csdn.net/xkukeer/article/details/130227944?share_token=f48bfb40-9399-4375-894e-3ecf96d1c51d
openai的whisper语音识别介绍

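Step 3: choose the model to use.
The official README lists five model sizes (tiny, base, small, medium, large). Four of them also come in English-only variants whose names end in .en (tiny.en, base.en, small.en, medium.en); the names without the .en suffix are the multilingual models. This is why base handled Chinese in testing (only base was tested; the other multilingual models have not been tested but should work as well).
Chinese is supported, but the results are not ideal: the recognition error rate (WER, Word Error Rate) for Chinese is still fairly high, roughly mid-table among all supported languages.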
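
As a rough illustration of what an error rate such as WER measures: it is the edit distance (substitutions, insertions, deletions) between the reference text and the recognized text, divided by the length of the reference. For Chinese the comparison is commonly done per character instead of per word. The sketch below is a minimal version of that calculation; edit_distance and error_rate are hypothetical helper names, and the reference string is an assumed correct spelling of a phrase from the transcript above.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def error_rate(ref_tokens, hyp_tokens):
    """Edit distance divided by the number of reference tokens."""
    if not ref_tokens:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


# Word-level comparison for English.
wer = error_rate("the cat sat".split(), "the cat sit".split())

# Character-level comparison for Chinese.
# Assumed reference: 衝鋒衣的功能性; recognized text: 春風衣的功能性.
cer = error_rate(list("衝鋒衣的功能性"), list("春風衣的功能性"))

print(f"WER={wer:.3f} CER={cer:.3f}")
```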

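Step 4: usage
There are several ways:
1. Command-line mode
whisper audio.flac audio.mp3 audio.wav --model medium

For non-English audio, add the --language option, for example Japanese:
whisper japanese.wav --language Japanese

Quite a few languages are supported.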
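
When the same command has to be run over many files, it can help to assemble the argument list in a script. The sketch below only builds the command from the --model and --language options shown in this document and prints it; build_whisper_command is a hypothetical helper, and nothing is executed here.

```python
import shlex


def build_whisper_command(files, model="medium", language=None):
    """Build the argument list for the whisper command line."""
    if not files:
        raise ValueError("at least one input file is required")
    cmd = ["whisper", *files, "--model", model]
    if language:
        cmd += ["--language", language]
    return cmd


cmd = build_whisper_command(["chs.mp4"], language="Chinese")
print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run(cmd, check=True) on a machine where whisper is installed; keeping it as a list avoids shell quoting problems with file names that contain spaces.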


【WINDOWS】
https://blog.csdn.net/liaoqingjian/article/details/132474687?share_token=e6ad6f74-2fab-45c5-bdb5-40b48fe2cd79
whisper 语音识别项目部署


https://www.toutiao.com/article/7327918175801164325/?app=news_article&timestamp=1706203446&use_new_style=1&req_id=202401260124058D2D3B0452AC9B3435B3&group_id=7327918175801164325&tt_from=mobile_qq&utm_source=mobile_qq&utm_medium=toutiao_android&utm_campaign=client_share&share_token=ad4cdc74-1590-4a7b-b020-14f9186f9ef2&source=m_redirect
Whisper对于中文语音识别与转写中文文本优化的实践(Python3.10)


【WINDOWS】
https://www.toutiao.com/article/7276749520275456572/?app=news_article&timestamp=1706203504&use_new_style=1&req_id=2024012601250342BCD0F3D434AA335380&group_id=7276749520275456572&tt_from=mobile_qq&utm_source=mobile_qq&utm_medium=toutiao_android&utm_campaign=client_share&share_token=5bc13cbe-db1d-4883-bff4-b01f258dd1c2&source=m_redirect
语音转文字软件Whisper,实时自动语音识别,音频视频文案提取


 
