Whisper is a general-purpose speech recognition model. It is trained on a large
dataset of diverse audio and is also a multitask model that can perform
multilingual speech recognition, speech translation, and language
identification.
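
As a rough illustration of the three tasks named above, the sketch below runs them on a single audio file. It assumes the `openai-whisper` Python package is installed and that `transcribe()` returns a dictionary with `"text"` and `"language"` keys. The helper names `run_whisper_tasks` and `demo`, the `"base"` model size, and the audio path are illustrative choices, not part of the project.

```python
from typing import Any, Dict


def run_whisper_tasks(model: Any, audio_path: str) -> Dict[str, str]:
    """Run recognition, translation, and language identification on one file.

    `model` is assumed to behave like the object returned by
    `whisper.load_model(...)`: it exposes `transcribe(path, **options)` and
    returns a dict containing "text" and "language".
    """
    # Multilingual speech recognition: transcribe in the spoken language.
    transcription = model.transcribe(audio_path, task="transcribe")

    # Speech translation: produce English text from non-English speech.
    translation = model.transcribe(audio_path, task="translate")

    return {
        # Language identification: the detected language code is reported
        # alongside the transcript when no language is specified.
        "language": transcription["language"],
        "transcript": transcription["text"].strip(),
        "english": translation["text"].strip(),
    }


def demo(audio_path: str, model_name: str = "base") -> Dict[str, str]:
    """Hypothetical entry point that loads a real model and runs all tasks."""
    # Imported lazily so the helper above can be used without the package.
    import whisper  # assumption: installed via `pip install -U openai-whisper`

    model = whisper.load_model(model_name)
    return run_whisper_tasks(model, audio_path)
```

Calling `demo("audio.mp3")` with a path to a real recording (the file name here is a placeholder) would download the chosen model on first use and return the detected language, the transcript, and an English translation. Running `transcribe()` twice is the simplest way to show both tasks, at the cost of decoding the audio twice.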