Faster-whisper
Author: DW
faster-whisper is a reimplementation of OpenAI's Whisper model using CTranslate2, a fast inference engine for Transformer models. This implementation is up to 4 times faster than openai/whisper at the same accuracy, while using less memory. Efficiency can be improved further with 8-bit quantization on both CPU and GPU.
Usage
```python
from faster_whisper import WhisperModel

model_size = "large-v3"

# Run on GPU with FP16
model = WhisperModel(model_size, device="cuda", compute_type="float16")

# or run on GPU with INT8
# model = WhisperModel(model_size, device="cuda", compute_type="int8_float16")
# or run on CPU with INT8
# model = WhisperModel(model_size, device="cpu", compute_type="int8")

segments, info = model.transcribe("audio.mp3", beam_size=5)

print("Detected language '%s' with probability %f" % (info.language, info.language_probability))

for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
```
Note: segments is a generator, so transcription only starts when you iterate over it. To run the transcription to completion, collect the segments into a list or exhaust them with a for loop:
```python
segments, _ = model.transcribe("audio.mp3")
segments = list(segments)  # The transcription will actually run here.
```
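Because segments is consumed lazily, you can stream the results into any output format without materializing the whole list. As a sketch, the helper below formats segments as SRT subtitles using only the start, end, and text fields shown in the usage example above; the Segment namedtuple is a hypothetical stand-in for the objects faster-whisper yields, used here so the example runs without a model:

```python
from collections import namedtuple

# Hypothetical stand-in for the segment objects yielded by model.transcribe();
# only the start/end/text fields shown in the usage example are assumed.
Segment = namedtuple("Segment", ["start", "end", "text"])

def to_srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def segments_to_srt(segments) -> str:
    """Consume a segment iterable lazily and emit SRT subtitle text."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n"
            f"{to_srt_timestamp(seg.start)} --> {to_srt_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    return "\n".join(blocks)

# Example with mock segments; with faster-whisper you would pass the
# generator returned by model.transcribe() directly.
mock = [Segment(0.0, 2.5, " Hello world."), Segment(2.5, 5.0, " Second line.")]
print(segments_to_srt(mock))
```

Since the transcription runs as the generator is consumed, subtitles appear incrementally rather than only after the whole file is processed.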
Links
https://github.com/SYSTRAN/faster-whisper
https://github.com/openai/whisper