fix(sarvam-tts): correct mime_type from audio/mp3 to audio/wav by shmundada93 · Pull Request #5086 · livekit/agents

shmundada93 · 2026-03-11T20:30:00Z

Summary

The Sarvam TTS API returns WAV (RIFF) audio data, but the plugin declares mime_type="audio/mp3" in both ChunkedStream (line 670) and SynthesizeStream (line 710). This causes the LiveKit audio decoder to attempt MP3 decoding on WAV data, resulting in:

av.error.InvalidDataError: Invalid data found when processing input: 'avcodec_send_packet()'

followed by:

APIError: no audio frames were pushed for text: <text>

This affects all Sarvam TTS models (bulbul:v2, bulbul:v3-beta, bulbul:v3).

Root cause

Verified by calling the Sarvam REST API directly and inspecting the raw response:

import base64
# audios: the list of base64-encoded audio strings taken from the REST response
raw = base64.b64decode(audios[0])
raw[:4]  # b'RIFF': WAV header, not MP3

All models and sample rates (8000, 16000, 22050, 24000) return RIFF/WAV audio.
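The header check above can be sketched as a small self-contained script. This is only an illustration under stated assumptions: sniff_audio_mime and make_wav_base64 are hypothetical helpers that are not part of the plugin or of the Sarvam API, and the payload is a synthetic WAV built locally with the standard library rather than a real API response.

```python
import base64
import io
import wave


def sniff_audio_mime(data: bytes) -> str:
    """Guess a MIME type from leading magic bytes (hypothetical helper)."""
    # RIFF container whose form type is WAVE
    if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "audio/wav"
    # MP3 files commonly start with an ID3v2 tag ...
    if data[:3] == b"ID3":
        return "audio/mp3"
    # ... or directly with an MPEG frame sync (11 set bits)
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
        return "audio/mp3"
    return "application/octet-stream"


def make_wav_base64(sample_rate: int = 22050) -> str:
    """Build a short silent mono 16-bit WAV and base64-encode it.

    Stands in for one entry of the response's audio list (assumption).
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * 100)
    return base64.b64encode(buf.getvalue()).decode("ascii")


audios = [make_wav_base64()]
raw = base64.b64decode(audios[0])
detected = sniff_audio_mime(raw)
print(raw[:4], detected)
```

With a real response, audios would instead hold the base64 strings returned by the API, and a result other than audio/wav would indicate that the declared mime_type needs another look.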

Fix

Change mime_type="audio/mp3" → mime_type="audio/wav" in both:

ChunkedStream._run() (HTTP batch path)
SynthesizeStream._run() (WebSocket streaming path)

Test plan

Tested with bulbul:v2 (anushka, en-IN) — audio decodes correctly
Tested with bulbul:v3 (shubh, hi-IN) — audio decodes correctly
Tested with bulbul:v3 (ritu, en-IN) — audio decodes correctly
Tested with bulbul:v3 + temperature=0.3 — audio decodes correctly
Tested with bulbul:v3 + enable_preprocessing=True — audio decodes correctly
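The manual checks above could be complemented by an automated one that parses each payload as WAV. The following is a rough sketch, not the test used for this PR: wav_params_from_base64 and synth_wav_base64 are hypothetical names, and the payloads are synthetic WAV data generated locally at the sample rates listed under Root cause, not real Sarvam responses.

```python
import base64
import io
import wave


def wav_params_from_base64(b64_audio: str) -> tuple[int, int, int]:
    """Decode a base64 payload and parse it as WAV.

    Returns (sample_rate, channels, n_frames). wave.open raises
    wave.Error if the data does not start with a RIFF/WAVE header.
    """
    raw = base64.b64decode(b64_audio)
    with wave.open(io.BytesIO(raw), "rb") as w:
        return w.getframerate(), w.getnchannels(), w.getnframes()


def synth_wav_base64(sample_rate: int, n_frames: int = 160) -> str:
    """Create a silent mono 16-bit WAV payload (stand-in for an API response)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * n_frames)
    return base64.b64encode(buf.getvalue()).decode("ascii")


# Check every sample rate mentioned in the PR description.
results = {
    rate: wav_params_from_base64(synth_wav_base64(rate))
    for rate in (8000, 16000, 22050, 24000)
}
for rate, params in results.items():
    print(rate, params)
```

Replacing synth_wav_base64 with a call that fetches real audio would turn this into a regression check: any payload that is not WAV would fail to parse instead of surfacing later as a decoder error.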

The Sarvam TTS API returns WAV (RIFF) audio data, but the plugin declares mime_type="audio/mp3" in both ChunkedStream and SynthesizeStream. This causes the LiveKit audio decoder to attempt MP3 decoding on WAV data, resulting in av.error.InvalidDataError: Invalid data found when processing input. Inspecting raw API responses confirmed that all Sarvam TTS models (bulbul:v2, v3-beta, v3) return base64-encoded WAV with RIFF headers. This fix updates both emission points to use the correct mime_type.

CLAassistant · 2026-03-11T20:30:08Z

All committers have signed the CLA.

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

davidzhao

lg

devin-ai-integration bot reviewed Mar 11, 2026

davidzhao approved these changes Mar 12, 2026

davidzhao merged commit 7f1a351 into livekit:main Mar 12, 2026
11 checks passed

davidzhao merged 1 commit into livekit:main from shmundada93:fix/sarvam-tts-mime-type
