Speech, audio, and AI music

Voice leader

ElevenLabs

ElevenLabs

Create lifelike speech with our AI voice generator and voice agents platform.

Closed Source / Platform | Free / Subscription | API
Voice & Audio, API, TTS
AI music

Suno

Suno

Create stunning original music for free in seconds using our AI generator.

Closed Source / Platform | Free / Subscription | No API
Voice & Audio, Music Generation, AI music
music composition

AIVA

AIVA

AIVA is a speech, audio, and AI music product from AIVA, focused on music composition, with tags such as AI music.

closed-source / platform | free / paid | No API
AI music
enterprise speech

Alibaba Cloud Speech

Alibaba Cloud (阿里云)

Alibaba Cloud Speech is a speech recognition product from Alibaba Cloud, focused on enterprise speech, with tags such as ASR, Speech API, and China ecosystem.

closed-source / platform | free trial / usage-based | API
ASR, Speech API, China ecosystem
enterprise narration

Amazon Polly

AWS

Amazon Polly is a text-to-speech product from AWS, focused on enterprise narration, with tags such as TTS, Voice API, and API.

closed-source / platform | free trial / usage-based | API
TTS, Voice API, API
cloud transcription

Amazon Transcribe

AWS

Amazon Transcribe is a speech recognition product from AWS, focused on cloud transcription, with tags such as ASR, Speech API, and API.

closed-source / platform | free trial / usage-based | API
ASR, Speech API, API
transcription platform

AssemblyAI

AssemblyAI

AssemblyAI is a speech recognition product from AssemblyAI, focused on its transcription platform, with tags such as ASR, Speech API, and API.

closed-source / platform | free trial / usage-based | API
ASR, Speech API, API
AI music

AudioGen Medium

Meta

AudioGen Medium is a speech, audio, and AI music product from Meta, focused on AI music, with tags such as AI music and Open source.

open-source / self-hosted | open source / self-hosted | No API
AI music, Open source
enterprise speech

Azure AI Speech

Microsoft

Azure AI Speech is a speech recognition product from Microsoft, focused on enterprise speech, with tags such as ASR, Speech API, and API.

closed-source / platform | free trial / usage-based | API
ASR, Speech API, API
enterprise voice

Azure Text to Speech

Microsoft

Azure Text to Speech is a text-to-speech product from Microsoft, focused on enterprise voice, with tags such as TTS, Voice API, and API.

closed-source / platform | free trial / usage-based | API
TTS, Voice API, API
cloud ASR

Baidu Speech

Baidu AI Cloud (百度智能云)

Baidu Speech is a speech recognition product from Baidu AI Cloud, focused on cloud ASR, with tags such as ASR, Speech API, and China ecosystem.

closed-source / platform | free trial / usage-based | API
ASR, Speech API, China ecosystem
royalty-friendly music

Beatoven.ai

Beatoven

Beatoven.ai is a speech, audio, and AI music product from Beatoven, focused on royalty-friendly music, with tags such as AI music.

closed-source / platform | free / paid | No API
AI music
one-click music generation

Boomy

Boomy

Boomy is a speech, audio, and AI music product from Boomy, focused on one-click music generation, with tags such as AI music.

closed-source / platform | free / paid | No API
AI music
speech recognition

Canary 1B

NVIDIA

Canary 1B is a speech recognition product from NVIDIA, with tags such as ASR and Open source.

open-source / self-hosted | open source / self-hosted | API
ASR, Open source
realtime voice

Cartesia

Cartesia

Cartesia is a text-to-speech product from Cartesia, focused on realtime voice, with tags such as TTS, Voice API, and API.

closed-source / platform | free trial / usage-based | API
TTS, Voice API, API
open-source TTS

Chatterbox TTS

Resemble AI

Chatterbox TTS is a text-to-speech product from Resemble AI, focused on open-source TTS, with tags such as TTS and Open source.

open-source / self-hosted | open source / self-hosted | API
TTS, Open source
open-source TTS

Coqui TTS

Coqui

Coqui TTS is a text-to-speech product from Coqui, focused on open-source TTS, with tags such as TTS and Open source.

open-source / self-hosted | open source / self-hosted | API
TTS, Open source
open-source TTS

CosyVoice

Alibaba Tongyi (阿里通义)

CosyVoice is a text-to-speech product from Alibaba Tongyi, focused on open-source TTS, with tags such as TTS, Open source, and China ecosystem.

open-source / self-hosted | open source / self-hosted | API
TTS, Open source, China ecosystem
open-source TTS

CosyVoice 2

Alibaba Tongyi (阿里通义)

CosyVoice 2 is a text-to-speech product from Alibaba Tongyi, focused on open-source TTS, with tags such as TTS, Open source, and China ecosystem.

open-source / self-hosted | open source / self-hosted | API
TTS, Open source, China ecosystem
open-source TTS

CSM-1B

Sesame

CSM-1B is a text-to-speech product from Sesame, focused on open-source TTS, with tags such as TTS and Open source.

open-source / self-hosted | open source / self-hosted | API
TTS, Open source
speech API

Deepgram

Deepgram

Deepgram is a speech recognition product from Deepgram, focused on its speech API, with tags such as ASR, Speech API, and Realtime.

closed-source / platform | free trial / usage-based | API
ASR, Speech API, Realtime
Edit suite

Descript

Descript

Descript makes editing video and audio as easy as editing text.

Closed Source / Platform | Free / Subscription | No API
Voice & Audio, ASR, AI music
voice localization

Dubverse

Dubverse

Dubverse is a text-to-speech product from Dubverse, focused on voice localization, with tags such as TTS, Translation, and Workflow.

closed-source / platform | free trial / usage-based | API
TTS, Translation, Workflow
open-source TTS

F5-TTS

SWivid

F5-TTS is a text-to-speech product from SWivid, focused on open-source TTS, with tags such as TTS and Open source.

open-source / self-hosted | open source / self-hosted | API
TTS, Open source
meeting transcription

Fireflies AI

Fireflies

Fireflies AI is a speech recognition product from Fireflies, focused on meeting transcription, with tags such as ASR, Workflow, and Knowledge base.

closed-source / platform | free trial / usage-based | API
ASR, Workflow, Knowledge base
voice clone

Fish Audio

Fish Audio

Fish Audio is a text-to-speech product from Fish Audio, focused on voice cloning, with tags such as TTS, Voice clone, and API.

closed-source / platform | free trial / usage-based | API
TTS, Voice clone, API
speech recognition framework

FunASR

Alibaba Tongyi (阿里通义)

FunASR is a speech recognition product from Alibaba Tongyi, offered as a speech recognition framework, with tags such as ASR, Open source, and China ecosystem.

open-source / self-hosted | open source / self-hosted | API
ASR, Open source, China ecosystem
multilingual transcription

Gladia

Gladia

Gladia is a speech recognition product from Gladia, focused on multilingual transcription, with tags such as ASR, Speech API, and API.

closed-source / platform | free trial / usage-based | API
ASR, Speech API, API
cloud ASR

Google Cloud Speech-to-Text

Google

Google Cloud Speech-to-Text is a speech recognition product from Google, focused on cloud ASR, with tags such as ASR, Speech API, and API.

closed-source / platform | free trial / usage-based | API
ASR, Speech API, API
cloud TTS

Google Cloud Text-to-Speech

Google

Google Cloud Text-to-Speech is a text-to-speech product from Google, focused on cloud TTS, with tags such as TTS, Voice API, and API.

closed-source / platform | free trial / usage-based | API
TTS, Voice API, API
Chinese ASR

iFlytek Open Platform

iFlytek (科大讯飞)

iFlytek Open Platform is a speech recognition product from iFlytek, focused on Chinese ASR, with tags such as ASR, Speech API, and China ecosystem.

closed-source / platform | free trial / usage-based | API
ASR, Speech API, China ecosystem
singing and voice clone

Kits AI

Kits AI

Kits AI is a text-to-speech product from Kits AI, focused on singing and voice cloning, with tags such as TTS, Voice clone, and AI music.

closed-source / platform | free trial / usage-based | API
TTS, Voice clone, AI music
open-source TTS

Kokoro 82M

Hexgrad

Kokoro 82M is a text-to-speech product from Hexgrad, focused on open-source TTS, with tags such as TTS and Open source.

open-source / self-hosted | open source / self-hosted | API
TTS, Open source
speech recognition

Kyutai STT 1B

Kyutai

Kyutai STT 1B is a speech recognition product from Kyutai, with tags such as ASR and Open source.

open-source / self-hosted | open source / self-hosted | API
ASR, Open source
commercial music generation

Loudly

Loudly

Loudly is a speech, audio, and AI music product from Loudly, focused on commercial music generation, with tags such as AI music.

closed-source / platform | free / paid | No API
AI music
Brand voice

LOVO

LOVO

Award-winning AI Voice Generator and text to speech software with 500+ voices in 100 languages.

Closed Source / Platform | Free / Subscription | No API
Voice & Audio, TTS, AI music
AI music

Lyria 2

Google

Lyria 2 is a speech, audio, and AI music product from Google, focused on AI music, with tags such as AI music.

closed-source / platform | free / paid | No API
AI music
stem separation

Moises

Moises

Moises is a speech, audio, and AI music product from Moises, focused on stem separation, with tags such as Audio editing and Workflow.

closed-source / platform | free / paid | No API
Audio editing, Workflow, AI music
open-source ASR

Moonshine ASR

Moonshine

Moonshine ASR is a speech recognition product from Moonshine, focused on open-source ASR, with tags such as ASR and Open source.

open-source / self-hosted | open source / self-hosted | API
ASR, Open source
streaming music generation

Mubert

Mubert

Mubert is a speech, audio, and AI music product from Mubert, focused on streaming music generation, with tags such as AI music and API.

closed-source / platform | free / paid | API
AI music, API
Commercial voice

Murf

Murf

Murf is a voice and audio product from Murf, focused on commercial voice workflows and official access.

Closed Source / Platform | Free / Subscription | No API
Voice & Audio, TTS, AI music
AI singing

Musicfy

Musicfy

Musicfy is a speech, audio, and AI music product from Musicfy, focused on AI singing, with tags such as AI music and Voice clone.

closed-source / platform | free / paid | No API
AI music, Voice clone
AI music

MusicGen Large

Meta

MusicGen Large is a speech, audio, and AI music product from Meta, focused on AI music, with tags such as AI music and Open source.

open-source / self-hosted | open source / self-hosted | No API
AI music, Open source
AI music

MusicLM

Google

MusicLM is a speech, audio, and AI music product from Google, focused on AI music, with tags such as AI music.

closed-source / platform | free / paid | No API
AI music
scripted narration

Narakeet

Narakeet

Narakeet is a text-to-speech product from Narakeet, focused on scripted narration, with tags such as TTS and Workflow.

closed-source / platform | free trial / usage-based | API
TTS, Workflow
open-source voice clone

OpenVoice

MyShell

OpenVoice is a text-to-speech product from MyShell, focused on open-source voice cloning, with tags such as TTS, Voice clone, and Open source.

open-source / self-hosted | open source / self-hosted | API
TTS, Voice clone, Open source
speech recognition

Parakeet RNNT 1.1B

NVIDIA

Parakeet RNNT 1.1B is a speech recognition product from NVIDIA, with tags such as ASR and Open source.

open-source / self-hosted | open source / self-hosted | API
ASR, Open source
open-source TTS

Piper TTS

Piper

Piper TTS is a text-to-speech product from Piper, focused on open-source TTS, with tags such as TTS and Open source.

open-source / self-hosted | open source / self-hosted | API
TTS, Open source
dialogue voice

PlayDialog

PlayDialog

PlayDialog is a text-to-speech product from PlayDialog, focused on dialogue voice, with tags such as TTS, Voice API, and API.

closed-source / platform | free trial / usage-based | API
TTS, Voice API, API
TTS API

PlayHT

PlayHT

PlayHT is a voice and audio product from PlayHT, focused on TTS API workflows and official access.

Closed Source / Platform | Free / Subscription | API
Voice & Audio, API, TTS
podcast workflow

Podcastle

Podcastle

Podcastle is a speech, audio, and AI music product from Podcastle, focused on podcast workflows, with tags such as Audio editing and Workflow.

closed-source / platform | free / paid | No API
Audio editing, Workflow, AI music
brand voiceover

ReadSpeaker

ReadSpeaker

ReadSpeaker is a text-to-speech product from ReadSpeaker, focused on brand voiceover, with tags such as TTS and Voice clone.

closed-source / platform | free trial / usage-based | API
TTS, Voice clone
Voice clone

Resemble AI

Resemble AI

Resemble AI helps enterprises generate secure voice AI, verify proper usage, and detect deepfakes instantly.

Closed Source / Platform | Free / Subscription | API
Voice & Audio, API, TTS
speech recognition API

Rev AI

Rev

Rev AI is a speech recognition product from Rev, focused on its speech recognition API, with tags such as ASR, Speech API, and API.

closed-source / platform | free trial / usage-based | API
ASR, Speech API, API
experimental music generation

Riffusion

Riffusion

Riffusion is a speech, audio, and AI music product from Riffusion, focused on experimental music generation, with tags such as AI music and Open source.

open-source / self-hosted | open source / self-hosted | No API
AI music, Open source
AI music

Riffusion Fuzz

Riffusion

Riffusion Fuzz is a speech, audio, and AI music product from Riffusion, focused on AI music, with tags such as AI music.

closed-source / platform | free / paid | No API
AI music
multilingual speech recognition

SeamlessM4T

Meta

SeamlessM4T is a speech recognition product from Meta, focused on multilingual speech recognition, with tags such as ASR and Open source.

open-source / self-hosted | open source / self-hosted | API
ASR, Open source
open-source speech recognition

SenseVoice

Alibaba Tongyi (阿里通义)

SenseVoice is a speech recognition product from Alibaba Tongyi, focused on open-source speech recognition, with tags such as ASR, Open source, and China ecosystem.

open-source / self-hosted | open source / self-hosted | API
ASR, Open source, China ecosystem
open-source ASR

Sherpa ONNX ASR

Sherpa

Sherpa ONNX ASR is a speech recognition product from Sherpa, focused on open-source ASR, with tags such as ASR and Open source.

open-source / self-hosted | open source / self-hosted | API
ASR, Open source
low-latency ASR

Soniox

Soniox

Soniox is a speech recognition product from Soniox, focused on low-latency ASR, with tags such as ASR, Realtime, and Speech API.

closed-source / platform | free trial / usage-based | API
ASR, Realtime, Speech API
background music generation

Soundraw

Soundraw

Soundraw is a speech, audio, and AI music product from Soundraw, focused on background music generation, with tags such as AI music.

closed-source / platform | free / paid | No API
AI music
open-source TTS

Spark-TTS

SparkAudio

Spark-TTS is a text-to-speech product from SparkAudio, focused on open-source TTS, with tags such as TTS and Open source.

open-source / self-hosted | open source / self-hosted | API
TTS, Open source
open-source ASR

SpeechBrain ASR

SpeechBrain

SpeechBrain ASR is a speech recognition product from SpeechBrain, focused on open-source ASR, with tags such as ASR and Open source.

open-source / self-hosted | open source / self-hosted | API
ASR, Open source
Read aloud

Speechify

Speechify

Speechify reads anything aloud to you.

Closed Source / Platform | Free / Subscription | No API
Voice & Audio, TTS, AI music
enterprise ASR

Speechmatics

Speechmatics

Speechmatics is a speech recognition product from Speechmatics, focused on enterprise ASR, with tags such as ASR, Speech API, and API.

closed-source / platform | free trial / usage-based | API
ASR, Speech API, API
AI music

Stable Audio

Stability AI

Stable Audio is a speech, audio, and AI music product from Stability AI, focused on AI music, with tags such as AI music and Audio editing.

closed-source / platform | free / paid | No API
AI music, Audio editing
AI audio

Stable Audio Open

Stability AI

Stable Audio Open is a speech, audio, and AI music product from Stability AI, focused on AI audio, with tags such as AI music and Audio editing.

closed-source / platform | free / paid | No API
AI music, Audio editing
open-source TTS

StyleTTS2

StyleTTS

StyleTTS2 is a text-to-speech product from StyleTTS, focused on open-source TTS, with tags such as TTS and Open source.

open-source / self-hosted | open source / self-hosted | API
TTS, Open source
AI music

Suno v4

Suno

Suno v4 is a speech, audio, and AI music product from Suno, focused on AI music, with tags such as AI music.

closed-source / platform | free / paid | No API
AI music
AI music

Suno v4.5

Suno

Suno v4.5 is a speech, audio, and AI music product from Suno, focused on AI music, with tags such as AI music.

closed-source / platform | free / paid | No API
AI music
speech API

Tencent Cloud ASR

Tencent Cloud (腾讯云)

Tencent Cloud ASR is a speech recognition product from Tencent Cloud, focused on its speech API, with tags such as ASR, Speech API, and China ecosystem.

closed-source / platform | free trial / usage-based | API
ASR, Speech API, China ecosystem
open-source TTS

Tortoise TTS

Tortoise

Tortoise TTS is a text-to-speech product from Tortoise, focused on open-source TTS, with tags such as TTS and Open source.

open-source / self-hosted | open source / self-hosted | API
TTS, Open source
online voiceover

TTSMaker

TTSMaker

TTSMaker is a text-to-speech product from TTSMaker, focused on online voiceover, with tags such as TTS.

closed-source / platform | free trial / usage-based | API
TTS
character voiceover

Typecast

Typecast

Typecast is a text-to-speech product from Typecast, focused on character voiceover, with tags such as TTS and Voice clone.

closed-source / platform | free trial / usage-based | API
TTS, Voice clone
Music generation

Udio

Udio

Discover, create, and share music with the world.

Closed Source / Platform | Free / Subscription | No API
Voice & Audio, Music Generation, AI music
AI music

Udio 1.3

Udio

Udio 1.3 is a speech, audio, and AI music product from Udio, focused on AI music, with tags such as AI music.

closed-source / platform | free / paid | No API
AI music
Enterprise TTS

WellSaid Labs

WellSaid Labs

WellSaid Labs is a voice and audio product from WellSaid Labs, focused on enterprise TTS workflows and official access.

Closed Source / Platform | Free / Subscription | No API
Voice & Audio, TTS, AI music
open ASR

Whisper

OpenAI

Whisper is a speech recognition product from OpenAI, focused on open ASR, with tags such as ASR and Open source.

open-source / self-hosted | open source / self-hosted | API
ASR, Open source
open-source ASR

Whisper Large V3

OpenAI

Whisper Large V3 is a speech recognition product from OpenAI, focused on open-source ASR, with tags such as ASR and Open source.

open-source / self-hosted | open source / self-hosted | API
ASR, Open source
open-source ASR

Whisper Large V3 Turbo

OpenAI

Whisper Large V3 Turbo is a speech recognition product from OpenAI, focused on open-source ASR, with tags such as ASR and Open source.

open-source / self-hosted | open source / self-hosted | API
ASR, Open source
open-source TTS

XTTS v2

Coqui

XTTS v2 is a text-to-speech product from Coqui, focused on open-source TTS, with tags such as TTS and Open source.

open-source / self-hosted | open source / self-hosted | API
TTS, Open source
open-source TTS

Zonos TTS

Zyphra

Zonos TTS is a text-to-speech product from Zyphra, focused on open-source TTS, with tags such as TTS and Open source.

open-source / self-hosted | open source / self-hosted | API
TTS, Open source
Selection guide

How to choose speech, audio, and AI music tools

  • Separate TTS, dubbing, music generation, and voice cloning before comparing tools.
  • For dubbing, listen for naturalness and phrasing before checking price; ads and narration suffer most from robotic delivery.
  • Multilingual projects should focus on language coverage, accent quality, and subtitle workflow support.
  • Before commercial use, verify voice licensing, consent for cloning, and content copyright.

What matters first on Speech, audio, and AI music category pages?

Start with official access, pricing model, API support, open/closed status, and common use cases.
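
The criteria above can be applied as a simple shortlist filter. What follows is a minimal sketch, not something this page provides: the `Tool` record, its field names, and the `shortlist` helper are hypothetical, and the sample rows only copy the open/closed and API fields shown for a few entries in the listing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """Hypothetical record for one listing card."""
    name: str
    category: str       # "ASR", "TTS", or "AI music"
    open_source: bool   # open/closed status from the card
    has_api: bool       # "API" vs "No API" on the card


# A few rows copied from the listing; extend as needed.
CATALOG = [
    Tool("Deepgram", "ASR", open_source=False, has_api=True),
    Tool("Whisper", "ASR", open_source=True, has_api=True),
    Tool("Amazon Polly", "TTS", open_source=False, has_api=True),
    Tool("Kokoro 82M", "TTS", open_source=True, has_api=True),
    Tool("Suno", "AI music", open_source=False, has_api=False),
    Tool("MusicGen Large", "AI music", open_source=True, has_api=False),
]


def shortlist(catalog, category, *, need_api=False, need_open_source=False):
    """Return names in one category that meet the API and license requirements."""
    return [
        tool.name
        for tool in catalog
        if tool.category == category
        and (tool.has_api or not need_api)
        and (tool.open_source or not need_open_source)
    ]


print(shortlist(CATALOG, "ASR", need_api=True))
print(shortlist(CATALOG, "AI music", need_open_source=True))
```

Filtering by category first mirrors the guide's advice to separate TTS, dubbing, music generation, and voice cloning before comparing tools; pricing and licensing checks still need to be done against each vendor's official terms.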