Chapter 9: Modern ASR Architectures — CTC, Attention, and Transformers
Voice and AI, Chapter 9: how end-to-end speech recognition handles alignment implicitly, covering CTC, attention-based models, transformers and self-attention, streaming constraints, and why architecture choice depends on the use case.