Chapter 24: Speech Recognition and Control

“Alexa, turn on the light.” Voice has become the most natural entry point to the smart home.

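24.1 Speech Recognition Basics

**ASR (Automatic Speech Recognition)**: converts speech to text, also called STT (speech-to-text).
- Pipeline: audio capture -> noise suppression / echo cancellation -> feature extraction -> acoustic model -> language model -> text.
**TTS (Text-to-Speech)**: converts text to speech.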
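
To make the feature-extraction stage of the pipeline more concrete, here is a minimal sketch. It assumes 16 kHz mono audio already loaded as a list of floats, and it computes only a per-frame log energy. Production front ends usually compute richer features such as MFCCs or filterbank energies; the function names and the 25 ms / 10 ms frame settings below are illustrative assumptions, not something the pipeline prescribes.

```python
import math


def frame_signal(samples, frame_len, hop_len):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop_len):
        frames.append(samples[start:start + frame_len])
    return frames


def log_energy(frame, floor=1e-10):
    """Log of the mean squared amplitude, floored so silence does not give log(0)."""
    energy = sum(x * x for x in frame) / len(frame)
    return math.log(max(energy, floor))


def extract_features(samples, sample_rate=16000, frame_ms=25, hop_ms=10):
    """Return one log-energy value per frame (assumed 25 ms frames, 10 ms hop)."""
    frame_len = sample_rate * frame_ms // 1000
    hop_len = sample_rate * hop_ms // 1000
    return [log_energy(f) for f in frame_signal(samples, frame_len, hop_len)]


if __name__ == "__main__":
    # One second of a synthetic 440 Hz tone stands in for microphone input.
    rate = 16000
    tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
    features = extract_features(tone, sample_rate=rate)
    print(len(features), "frames")
```

In a real recognizer, the acoustic model would consume a vector of features per frame rather than a single number, but the framing step looks much the same.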

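24.2 Offline vs. Online

Online recognition: high accuracy, but it depends on a network connection. Examples include the Baidu speech API and iFLYTEK.
Offline recognition: fast response (no network round trip) and better privacy, but a limited vocabulary. It suits fixed command words (such as “turn on the air conditioner”).
- Options: lightweight ASR chips (such as those from Chipintelli, 启英泰伦).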
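
The fixed-command-word approach can be sketched in a few lines. The example below assumes the recognizer has already produced a text transcript and simply maps it onto a small closed vocabulary, tolerating minor misrecognitions with `difflib` from the standard library. The command phrases, device names, and the 0.8 similarity cutoff are all hypothetical choices for illustration; a dedicated ASR chip performs this matching on the audio itself rather than on text.

```python
import difflib

# Hypothetical closed vocabulary: phrase -> (device, action).
COMMANDS = {
    "turn on the air conditioner": ("air_conditioner", "on"),
    "turn off the air conditioner": ("air_conditioner", "off"),
    "turn on the light": ("light", "on"),
    "turn off the light": ("light", "off"),
}


def match_command(transcript, cutoff=0.8):
    """Return (device, action) for the closest known phrase, or None if nothing is close."""
    text = " ".join(transcript.lower().split())  # normalize case and whitespace
    if text in COMMANDS:
        return COMMANDS[text]
    # Fall back to fuzzy matching so small recognition errors are still accepted.
    close = difflib.get_close_matches(text, list(COMMANDS), n=1, cutoff=cutoff)
    return COMMANDS[close[0]] if close else None


if __name__ == "__main__":
    print(match_command("Turn on the air conditioner"))
    print(match_command("play some jazz"))
```

Anything outside the vocabulary returns `None`, which is the trade-off the text describes: predictable and private, but unable to handle open-ended requests.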

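24.3 Building a Voice Assistant

Core components:

**Wake word detection**: low-power, always-on listening for a specific phrase (such as “小爱同学”, Xiao Ai Tongxue).
**Intent recognition (NLU)**: works out what the user wants to do.
- The user says “It's too hot” -> intent: adjust_temperature, slot: action=lower.
**Dialogue management (DM)**: maintains context across turns (multi-turn dialogue).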
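
As a rough illustration of how intent recognition and dialogue management fit together, the sketch below uses keyword rules for NLU and a dialogue manager that remembers the last intent, so that a follow-up such as “a bit more” can reuse it. The rule table, class names, and follow-up heuristic are hypothetical; real assistants generally rely on trained NLU models rather than keyword lists.

```python
from dataclasses import dataclass, field


@dataclass
class Intent:
    name: str
    slots: dict = field(default_factory=dict)


# Hypothetical keyword rules: any listed phrase triggers the paired intent.
RULES = [
    (("too hot", "so hot"), Intent("adjust_temperature", {"action": "lower"})),
    (("too cold", "freezing"), Intent("adjust_temperature", {"action": "raise"})),
]


def parse_intent(utterance):
    """Return the first matching Intent, or None when no rule applies."""
    text = utterance.lower()
    for keywords, intent in RULES:
        if any(keyword in text for keyword in keywords):
            # Copy so callers cannot mutate the shared rule table.
            return Intent(intent.name, dict(intent.slots))
    return None


class DialogueManager:
    """Keeps the previous intent as context for multi-turn dialogue."""

    def __init__(self):
        self.last_intent = None

    def handle(self, utterance):
        intent = parse_intent(utterance)
        words = utterance.lower().split()
        # A bare follow-up ("a bit more") repeats the previous intent, if any.
        if intent is None and "more" in words and self.last_intent is not None:
            intent = self.last_intent
        if intent is not None:
            self.last_intent = intent
        return intent


if __name__ == "__main__":
    dm = DialogueManager()
    print(dm.handle("It's too hot in here"))
    print(dm.handle("a bit more"))
```

Without the stored context, the second utterance would be meaningless on its own; that stored state is the part the dialogue manager contributes.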

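24.4 Hands-On Project: A Simple Voice Assistant

The assistant is implemented in Python with the SpeechRecognition library for speech-to-text and pyttsx3 for spoken replies. Capturing audio through sr.Microphone also requires PyAudio.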

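import speech_recognition as sr
import pyttsx3

# Initialize the TTS engine
engine = pyttsx3.init()

def speak(text):
    """Speak the text aloud and block until playback finishes."""
    engine.say(text)
    engine.runAndWait()

# Listen for a command
r = sr.Recognizer()
with sr.Microphone() as source:
    # Calibrate the energy threshold against background noise
    r.adjust_for_ambient_noise(source, duration=1)
    print("Listening...")
    audio = r.listen(source)

try:
    # recognize_google calls an online service, so it needs network access
    command = r.recognize_google(audio)
    print(f"You said: {command}")

    # Match on individual lowercase words so that both "light on" and
    # "turn on the light" trigger the action
    words = command.lower().split()
    if "light" in words and "on" in words:
        # TODO: send the MQTT command to turn on the light
        speak("Turning on the light.")

except sr.UnknownValueError:
    print("Could not understand audio")
except sr.RequestError as e:
    # Raised when the recognition service cannot be reached
    print(f"Speech service request failed: {e}")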
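
The MQTT TODO in the script above could be filled in along the following lines. This is only a sketch: it assumes the paho-mqtt package is installed and a broker is reachable, and the broker address, topic name, and JSON payload format are hypothetical placeholders that must be replaced with whatever your light controller actually expects.

```python
import json

# Hypothetical values: adjust to your own broker and device topic.
BROKER_HOST = "localhost"
LIGHT_TOPIC = "home/livingroom/light/set"


def build_light_message(turn_on):
    """Return the (topic, payload) pair for switching the light on or off."""
    payload = json.dumps({"state": "ON" if turn_on else "OFF"})
    return LIGHT_TOPIC, payload


def send_light_command(turn_on, host=BROKER_HOST, port=1883):
    """Publish a single message to the broker, then disconnect."""
    # Imported lazily so the rest of the module loads without paho-mqtt.
    import paho.mqtt.publish as publish

    topic, payload = build_light_message(turn_on)
    publish.single(topic, payload=payload, qos=1, hostname=host, port=port)
```

With this in place, the TODO line becomes a call to `send_light_command(True)` just before `speak(...)`. Keeping message construction separate from publishing makes the payload easy to check without a running broker.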

This completes Part 6, “Integrating AI and IoT.” Next, Part 7, “Hands-On Projects,” brings together everything covered so far.