speech2text

Automatic speech-to-text conversion
7.8K Downloads
Updated 25 Apr 2024

Automate labeling and tagging of speech recordings, assess the performance of DSP pipelines for voice and speech enhancement, run text analytics on voice recordings, and more.
This entry enables you to convert sampled speech recordings available as MATLAB vectors into strings using a single function call. Starting from MATLAB release R2022a, this also enables you to perform speech transcription interactively using the Signal Labeler app.
You need an Audio Toolbox license, an internet connection, and an active subscription to a speech-to-text service of your choice: Google™ Cloud Speech-to-Text API, IBM™ Watson Speech to Text API, Microsoft™ Azure Speech Services API, or Amazon™ Transcribe. Amazon™ Transcribe requires R2022b or later.
Starting in MATLAB R2022b, you can use speech2text with a pretrained wav2vec 2.0 model without needing to download this functionality from File Exchange. For more information, see: https://www.mathworks.com/help/audio/ref/speech2text.html
See the Examples tab for detailed instructions on how to get started.
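As a rough illustration of the single-function-call workflow described above, the following minimal sketch transcribes a recording with the pretrained wav2vec 2.0 model. It assumes R2022b or later, that the model files are already installed, and that a file named speech.wav (a hypothetical name) exists in the current folder. The calling form can differ between releases (the version history mentions a new syntax for R2024a), so treat this as a sketch and check the reference page linked above.

```matlab
% Read a speech recording into a vector y with sample rate fs.
% "speech.wav" is a hypothetical file name used for illustration.
[y, fs] = audioread("speech.wav");

% Create a client for the pretrained wav2vec 2.0 model.
% Assumption: R2022b or later, with the model files already downloaded.
transcriber = speechClient("wav2vec2.0");

% Convert the sampled speech to text with one function call.
transcript = speech2text(transcriber, y, fs);
disp(transcript)
```

To use one of the cloud services instead, you would create the client with that service's name (for example "Google") after setting up credentials as described in the Examples tab.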
See also: Automatic text-to-speech synthesis (text2speech) https://www.mathworks.com/matlabcentral/fileexchange/73326-text2speech

Cite As

MathWorks Audio Toolbox Team (2024). speech2text (https://www.mathworks.com/matlabcentral/fileexchange/65266-speech2text), MATLAB Central File Exchange. Retrieved .

MATLAB Release Compatibility
Created with R2024a
Compatible with R2021a and later releases
Platform Compatibility
Windows macOS Linux


speechclients/setup

speechclients/examples

Version: Release Notes

1.2.24: Added example.
1.2.23: Bug fix for the R2024a release. Supports the new syntax.
1.2.22: Re-architected for better source control.
1.2.21: Removed the /en folder from the path.
1.2.20: Fix for the Amazon client when using the toolbox installation.
1.2.19: Bug fix for the Amazon speech client.
1.2.18: Added support for Amazon Transcribe.
1.2.17: Edited the example live script.
1.2.16: Updated the description and the mlx file.
1.2.15: Added Audio Toolbox as a required product.
1.2.14: Improved command-line help for speechClient.
1.2.13: Fixed a bug in the HTTPTimeOut argument parsing of speech2text. The bug was introduced in 1.2.12.
1.2.12: Updated to work with the R2022b speech2text support for wav2vec 2.0.
1.2.11: Updated for a change in the IBM URL.
1.2.10: Fixed diarization for the Google client.
1.2.9: Added missing IBM authentication steps, whose absence caused errors with old credentials.
1.2.8: Re-uploaded as a toolbox file.
1.2.7: Re-uploaded to fix a corrupt toolbox file.
1.2.6: Handles the new authentication token format for the Microsoft API.
1.2.5: Added a link to text2speech.
1.2.4: Better error handling when an Audio Toolbox license is not available.
1.2.3: Allows specifying a custom recognize URL for the Google client. This provides a way to use beta versions of the Google Cloud Speech-to-Text API.
1.2.2: Prevents adding the setup script to the MATLAB path.
1.2.1: Typo fix.
1.2.0: Added support for interactive speech-to-text transcription using Audio Labeler in MATLAB release R2019b.
1.1.5.0: Addressed compatibility issues in older MATLAB releases (R2017a and R2017b).
1.1.4.0: Added support for new authentication schemes for the IBM and Microsoft APIs.
1.1.3.0: Corrected the path update on install.
1.1.2.0: Improved handling of errors and of missing data in responses when using the Microsoft API.
1.1.1.0: Updates for changes to the IBM API.
1.1.0.0: Added files under Files/en to enable command-line help for p-coded files. Added the HTTPTimeOut option to allow using longer speech recordings. Added an error message to better handle the case where an HTTP request succeeds but the API does not return any transcription data.
1.0.0.0