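Scenario: meeting transcripts. The company provides a WAV recording after each meeting and wants it converted to a plain-text file to save typing time.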
Prerequisite: install the speech recognition package first:
pip install SpeechRecognition
The Python program is as follows:
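import subprocess
import speech_recognition as sr

# Pass language='ja-JP' to recognize Japanese, or language='en-US' for English.
def convert_audio_to_text(audio_file):
    """Transcribe an audio file with recognize_google and return the text."""
    recognizer = sr.Recognizer()
    # Read the entire file into memory as a single AudioData object.
    with sr.AudioFile(audio_file) as source:
        audio = recognizer.record(source)
    try:
        text = recognizer.recognize_google(audio, language='zh-TW')
        return text
    except sr.UnknownValueError:
        # The service responded but could not make out any speech.
        return "Could not understand the audio"
    except sr.RequestError:
        # The request itself failed (no network, service unavailable, etc.).
        return "Could not reach the speech recognition service"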
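Meeting recordings tend to run long, and sending the whole file in a single request may be rejected or time out. The following is a minimal sketch of one workaround, assuming it is acceptable to transcribe the audio in fixed-length pieces. The names `wav_duration_seconds`, `chunk_count`, and `transcribe_in_chunks` are hypothetical helpers, and the 30-second chunk length is an arbitrary choice rather than a documented limit. It relies on `Recognizer.record(source, duration=...)` reading consecutive segments from an already open `AudioFile`.

```python
import math
import wave


def wav_duration_seconds(path):
    """Return the length of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def chunk_count(total_seconds, chunk_seconds=30):
    """Number of chunk_seconds-long pieces needed to cover total_seconds."""
    return math.ceil(total_seconds / chunk_seconds)


def transcribe_in_chunks(audio_file, language="zh-TW", chunk_seconds=30):
    """Transcribe a long WAV file piece by piece (hypothetical helper)."""
    # Imported here so the two helpers above work without the package installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    pieces = []
    n_chunks = chunk_count(wav_duration_seconds(audio_file), chunk_seconds)
    with sr.AudioFile(audio_file) as source:
        for _ in range(n_chunks):
            # Each call continues from where the previous one stopped.
            audio = recognizer.record(source, duration=chunk_seconds)
            if not audio.frame_data:
                break  # nothing left to read
            try:
                pieces.append(
                    recognizer.recognize_google(audio, language=language)
                )
            except sr.UnknownValueError:
                # Silence or unclear speech in this piece; keep a placeholder.
                pieces.append("[unintelligible]")
    return "\n".join(pieces)
```

A word that straddles a chunk boundary may be cut or misrecognized, so the result is best treated as a draft to proofread. A `sr.RequestError` is deliberately left to propagate, so a network failure stops the run instead of silently producing a partial transcript.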
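# Open a file dialog so the user can pick the audio file to convert.
import tkinter as tk
from tkinter import filedialog

root = tk.Tk()
root.withdraw()  # hide the empty main window; only the dialog is shown
# Ask for the audio file that will be transcribed to plain text.
file_path = filedialog.askopenfilename(
    initialdir="/",
    title="Select the audio file to transcribe",
    filetypes=(("Audio files", "*.wav"), ("All files", "*.*")),
)
if not file_path:
    raise SystemExit("No file selected")  # the dialog was cancelled
text1 = convert_audio_to_text(file_path)
print(text1)
# Write as UTF-8 so Chinese text does not depend on the system default encoding.
with open('wavfile.txt', mode='w', encoding='utf-8') as f:
    f.write(text1)
# Open the newly created text file right away (Windows only: uses explorer).
subprocess.Popen('explorer "wavfile.txt"')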
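The script always writes to `wavfile.txt` in the current directory, so each run overwrites the previous transcript. A minimal sketch of one alternative, assuming it is acceptable to store the transcript beside the recording: the hypothetical helper `save_transcript` derives the output name from the audio path and writes UTF-8 explicitly.

```python
from pathlib import Path


def save_transcript(text, audio_path):
    """Write text to a .txt file named after the recording; return its path."""
    # meeting.wav -> meeting.txt, in the same directory as the recording.
    out_path = Path(audio_path).with_suffix(".txt")
    # Explicit UTF-8 avoids depending on the platform's default encoding.
    out_path.write_text(text, encoding="utf-8")
    return out_path
```

In the script, this would be called as `save_transcript(text1, file_path)`, and the returned path could then be opened in place of the fixed file name.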
Other speech-to-text resources:
OpenAI (Whisper)
Other text-conversion tools:
Image to text (JPG to TXT)