What is the Speech SDK?

01/22/2024

The Speech SDK (software development kit) exposes many of the Speech service capabilities so that you can develop speech-enabled applications. It's available in many programming languages and across platforms. It suits both real-time and non-real-time scenarios, working with local devices, files, Azure Blob Storage, and input and output streams.
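As a rough illustration of the file-based scenario, the following Python sketch transcribes a single utterance from a WAV file. It assumes the `azure-cognitiveservices-speech` package is installed and that the resource key and region are stored in environment variables named `SPEECH_KEY` and `SPEECH_REGION`. The helper names `load_settings` and `recognize_from_file` are hypothetical and aren't part of the SDK.

```python
import os


def load_settings(env=None):
    """Read the Speech resource key and region (assumed variable names)."""
    env = os.environ if env is None else env
    key = env.get("SPEECH_KEY")
    region = env.get("SPEECH_REGION")
    if not key or not region:
        raise ValueError("Set SPEECH_KEY and SPEECH_REGION before running.")
    return key, region


def recognize_from_file(wav_path, key, region, language="en-US"):
    """Transcribe one utterance from a WAV file and return its text."""
    # Imported lazily so this module loads even where the SDK isn't installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    speech_config.speech_recognition_language = language

    # File input. For a local device, AudioConfig(use_default_microphone=True)
    # can be used instead.
    audio_config = speechsdk.audio.AudioConfig(filename=wav_path)
    recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config, audio_config=audio_config
    )

    # recognize_once() listens until the first utterance ends.
    result = recognizer.recognize_once()
    if result.reason == speechsdk.ResultReason.RecognizedSpeech:
        return result.text
    if result.reason == speechsdk.ResultReason.NoMatch:
        return ""  # Audio was processed but no speech was recognized.
    raise RuntimeError(f"Recognition did not complete: {result.reason}")
```

A call such as `recognize_from_file("sample.wav", *load_settings())` would return the recognized text, where `sample.wav` is a placeholder file name. For longer audio, continuous recognition is likely a better fit than a single `recognize_once()` call.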

When you can't or shouldn't use the Speech SDK, you can use the REST APIs to access the Speech service instead. For example, use the Speech to text REST API for batch transcription and custom speech.
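To give a sense of what a REST call for batch transcription might look like, the following Python sketch builds (but doesn't send) a request that creates a transcription job. The `v3.1` version in the URL, the JSON field names, and the helper name `build_batch_transcription_request` are assumptions for illustration. Check the current Speech to text REST API reference before relying on them.

```python
import json
import urllib.request


def build_batch_transcription_request(
    region, key, content_urls, locale="en-US", display_name="Example transcription"
):
    """Build a POST request that would create a batch transcription job."""
    # Assumed endpoint shape: regional host plus a versioned API path.
    url = (
        f"https://{region}.api.cognitive.microsoft.com"
        "/speechtotext/v3.1/transcriptions"
    )
    body = {
        "contentUrls": list(content_urls),  # Audio files, for example in Blob Storage
        "locale": locale,
        "displayName": display_name,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit the job. Batch transcription is asynchronous, so the response describes the job rather than containing the transcript itself.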

Supported languages

The Speech SDK supports the following languages and platforms:

Programming language	Reference	Platform support
C# ¹	.NET	Windows, Linux, macOS, Mono, Xamarin.iOS, Xamarin.Mac, Xamarin.Android, UWP, Unity
C++ ²	C++	Windows, Linux, macOS
Go	Go	Linux
Java	Java	Android, Windows, Linux, macOS
JavaScript	JavaScript	Browser, Node.js
Objective-C	Objective-C	iOS, macOS
Python	Python	Windows, Linux, macOS
Swift	Objective-C ³	iOS, macOS

¹ C# code samples are available in the documentation. The Speech SDK for C# is based on .NET Standard 2.0, so it supports many platforms and programming languages. For more information, see .NET implementation support.
² C isn't a supported programming language for the Speech SDK.
³ The Speech SDK for Swift shares client libraries and reference documentation with the Speech SDK for Objective-C.

Important

By downloading any of the Azure AI Speech SDKs, you acknowledge its license. For more information, see:

Speech SDK demo

The following video shows how to install the Speech SDK for C# and write a .NET console application for speech to text.

Code samples

Speech SDK code samples are available in the documentation and GitHub.

Docs samples

At the top of documentation pages that contain samples, you can select a programming language: C#, C++, Go, Java, JavaScript, Objective-C, Python, or Swift.

[Screenshot: selecting a programming language in the documentation.]

If a sample isn't available in your preferred programming language, you can select another programming language to get started and learn about the concepts, or see the reference and samples linked from the beginning of the article.

GitHub samples

In-depth samples are available in the Azure-Samples/cognitive-services-speech-sdk repository on GitHub. There are samples for C# (including UWP, Unity, and Xamarin), C++, Java, JavaScript (including Browser and Node.js), Objective-C, Python, and Swift. Code samples for Go are available in the Microsoft/cognitive-services-speech-sdk-go repository on GitHub.

Help options

The Microsoft Q&A and Stack Overflow forums are available for the developer community to ask and answer questions about Azure Cognitive Speech and other services. Microsoft monitors the forums and replies to questions that the community hasn't yet answered. To make sure that we see your question, tag it with 'azure-speech'.

You can suggest an idea or report a bug by creating an issue on GitHub:

See also Azure AI services support and help options to get support, stay up-to-date, give feedback, and report bugs for Azure AI services.
