Python SDK for Palabra AI's real-time speech-to-speech translation API. Break down language barriers and enable seamless communication across 25+ languages.
Overview
The Palabra AI Python SDK provides a high-level API for integrating real-time speech-to-speech translation into your Python applications.
What can Palabra.ai do?
- Real-time speech-to-speech translation with near-zero latency
- Auto voice cloning - speak any language in YOUR voice
- Two-way simultaneous translation for live discussions
- Developer API/SDK for building your own apps
- Works everywhere - Zoom, streams, events, any platform
- Zero data storage - your conversations stay private
This SDK focuses on making real-time translation simple and accessible:
- Uses WebRTC and WebSockets under the hood
- Abstracts away all complexity
- Simple configuration with source/target languages
- Supports multiple input/output adapters (microphones, speakers, files, buffers)
How it works:
- Configure input/output adapters
- SDK handles the entire pipeline
- Automatic transcription, translation, and synthesis
- Real-time audio stream ready for playback
All with just a few lines of code!
Installation
From PyPI
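The install command itself was not included above; as a sketch, assuming the package is published on PyPI under the name `palabra-ai` (matching the `palabra_ai` import name):

```shell
pip install palabra-ai
```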
macOS SSL Certificate Setup
If you encounter SSL certificate errors on macOS like:
```
SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate
```
Option 1: Install Python certificates (recommended)
Option 2: Use system certificates
This will configure Python to use your system's certificate store.
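The exact commands for the two options were not preserved above; as a sketch, assuming a python.org installer on macOS (the Python version in the path is an assumption, adjust it to your installation), they typically look like:

```shell
# Option 1: run the certificate installer bundled with python.org builds
/Applications/Python\ 3.12/Install\ Certificates.command

# Option 2: point Python at certifi's certificate bundle
pip install certifi
export SSL_CERT_FILE="$(python -m certifi)"
```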
Quick Start
Real-time microphone translation
```python
from palabra_ai import (PalabraAI, Config, SourceLang, TargetLang,
                        EN, ES, DeviceManager)

palabra = PalabraAI()
dm = DeviceManager()
mic, speaker = dm.select_devices_interactive()
cfg = Config(SourceLang(EN, mic), [TargetLang(ES, speaker)])
palabra.run(cfg)
```
Set your API credentials as environment variables:
```shell
export PALABRA_CLIENT_SECRET=your_client_secret
```
Examples
File-to-file translation
```python
from palabra_ai import (PalabraAI, Config, SourceLang, TargetLang,
                        FileReader, FileWriter, EN, ES)

palabra = PalabraAI()
reader = FileReader("./speech/es.mp3")
writer = FileWriter("./es2en_out.wav")
cfg = Config(SourceLang(ES, reader), [TargetLang(EN, writer)])
palabra.run(cfg)
```
Multiple target languages
```python
from palabra_ai import (PalabraAI, Config, SourceLang, TargetLang,
                        FileReader, FileWriter, EN, ES, FR, DE)

palabra = PalabraAI()
config = Config(
    source=SourceLang(EN, FileReader("presentation.mp3")),
    targets=[
        TargetLang(ES, FileWriter("spanish.wav")),
        TargetLang(FR, FileWriter("french.wav")),
        TargetLang(DE, FileWriter("german.wav"))
    ]
)
palabra.run(config)
```
Customizable output
Add transcriptions of the source and translated speech. The output can be configured to provide:
- Audio only
- Transcriptions only
- Both audio and transcriptions
```python
from palabra_ai import (
    PalabraAI,
    Config,
    SourceLang,
    TargetLang,
    FileReader,
    EN,
    ES,
)
from palabra_ai.base.message import TranscriptionMessage

async def print_translation_async(msg: TranscriptionMessage):
    print(repr(msg))

def print_translation(msg: TranscriptionMessage):
    print(str(msg))

palabra = PalabraAI()
cfg = Config(
    source=SourceLang(
        EN,
        FileReader("speech/en.mp3"),
        print_translation  # Callback for source transcriptions
    ),
    targets=[
        TargetLang(
            ES,
            # You can use transcription alone, without an audio writer
            # FileWriter("./test_output.wav"),  # Optional: audio output
            on_transcription=print_translation_async  # Callback for translated transcriptions
        )
    ],
    silent=True,  # Set to True to disable verbose logging to console
)
palabra.run(cfg)
```
Transcription output options:
1. Audio only (default): pass an audio writer (e.g. `FileWriter`) and no `on_transcription` callback.
2. Transcription only: pass an `on_transcription` callback and no audio writer.
3. Audio and transcription: pass both an audio writer and an `on_transcription` callback.
The transcription callbacks receive TranscriptionMessage objects containing the transcribed text and metadata.
Callbacks can be either synchronous or asynchronous functions.
Integrate with FFmpeg (streaming)
```python
import io

from palabra_ai import (PalabraAI, Config, SourceLang, TargetLang,
                        BufferReader, BufferWriter, AR, EN, RunAsPipe)

ffmpeg_cmd = [
    'ffmpeg',
    '-i', 'speech/ar.mp3',
    '-f', 's16le',           # 16-bit PCM
    '-acodec', 'pcm_s16le',
    '-ar', '48000',          # 48kHz
    '-ac', '1',              # mono
    '-'                      # output to stdout
]

pipe_buffer = RunAsPipe(ffmpeg_cmd)
es_buffer = io.BytesIO()

palabra = PalabraAI()
reader = BufferReader(pipe_buffer)
writer = BufferWriter(es_buffer)
cfg = Config(SourceLang(AR, reader), [TargetLang(EN, writer)])
palabra.run(cfg)

print(f"Translated audio written to buffer with size: {es_buffer.getbuffer().nbytes} bytes")
with open("./ar2en_out.wav", "wb") as f:
    f.write(es_buffer.getbuffer())
```
Using buffers
```python
import io

from palabra_ai import (PalabraAI, Config, SourceLang, TargetLang,
                        BufferReader, BufferWriter, AR, EN)
from palabra_ai.internal.audio import convert_any_to_pcm16

en_buffer, es_buffer = io.BytesIO(), io.BytesIO()
with open("speech/ar.mp3", "rb") as f:
    en_buffer.write(convert_any_to_pcm16(f.read()))

palabra = PalabraAI()
reader = BufferReader(en_buffer)
writer = BufferWriter(es_buffer)
cfg = Config(SourceLang(AR, reader), [TargetLang(EN, writer)])
palabra.run(cfg)

print(f"Translated audio written to buffer with size: {es_buffer.getbuffer().nbytes} bytes")
with open("./ar2en_out.wav", "wb") as f:
    f.write(es_buffer.getbuffer())
```
Using default audio devices
```python
from palabra_ai import PalabraAI, Config, SourceLang, TargetLang, DeviceManager, EN, ES

dm = DeviceManager()
reader, writer = dm.get_default_readers_writers()

if reader and writer:
    palabra = PalabraAI()
    config = Config(
        source=SourceLang(EN, reader),
        targets=[TargetLang(ES, writer)]
    )
    palabra.run(config)
```
Async Translation
```python
import asyncio

from palabra_ai import PalabraAI, Config, SourceLang, TargetLang, FileReader, FileWriter, EN, ES

async def translate():
    palabra = PalabraAI()
    config = Config(
        source=SourceLang(EN, FileReader("input.mp3")),
        targets=[TargetLang(ES, FileWriter("output.wav"))]
    )
    result = await palabra.arun(config)
    # Result contains: result.ok, result.exc, result.log_data

if __name__ == "__main__":
    asyncio.run(translate())
```
Synchronous Translation
```python
from palabra_ai import PalabraAI, Config, SourceLang, TargetLang, FileReader, FileWriter, EN, ES

# Synchronous execution (blocks until complete)
palabra = PalabraAI()
config = Config(
    source=SourceLang(EN, FileReader("input.mp3")),
    targets=[TargetLang(ES, FileWriter("output.wav"))]
)
result = palabra.run(config)
# Result contains: result.ok, result.exc, result.log_data
```
Signal Handling
```python
result = palabra.run(config, signal_handlers=True)

# Default behavior (signal handlers disabled)
result = palabra.run(config)  # signal_handlers=False by default
```
Result Handling
Both run() and arun() return a RunResult object with status information:
```python
result = palabra.run(config)
# or: result = await palabra.arun(config)

if result.ok:
    print("Translation completed successfully!")
    if result.log_data:
        print(f"Processing stats: {result.log_data}")
    if result.eos:
        print("End of stream signal received")
else:
    print(f"Translation failed: {result.exc}")
```
I/O Adapters & Mixing
Available adapters
The Palabra AI SDK provides flexible I/O adapters that can be freely combined:
- FileReader/FileWriter: Read from and write to audio files
- DeviceReader/DeviceWriter: Use microphones and speakers
- BufferReader/BufferWriter: Work with in-memory buffers
- RunAsPipe: Run a command and expose its output as a pipe (e.g., FFmpeg stdout)
Mixing examples
Combine any input adapter with any output adapter:
Microphone to file - record translations
```python
config = Config(
    source=SourceLang(EN, mic),
    targets=[TargetLang(ES, FileWriter("recording_es.wav"))]
)
```
File to speaker - play translations
```python
config = Config(
    source=SourceLang(EN, FileReader("presentation.mp3")),
    targets=[TargetLang(ES, speaker)]
)
```
Microphone to multiple outputs
```python
config = Config(
    source=SourceLang(EN, mic),
    targets=[
        TargetLang(ES, speaker),                    # Play Spanish through speaker
        TargetLang(ES, FileWriter("spanish.wav")),  # Save Spanish to file
        TargetLang(FR, FileWriter("french.wav"))    # Save French to file
    ]
)
```
Buffer to buffer - for integration
```python
input_buffer = io.BytesIO()   # filled with PCM16 audio beforehand
output_buffer = io.BytesIO()
config = Config(
    source=SourceLang(EN, BufferReader(input_buffer)),
    targets=[TargetLang(ES, BufferWriter(output_buffer))]
)
```
FFmpeg pipe to speaker
```python
config = Config(
    source=SourceLang(EN, BufferReader(pipe)),
    targets=[TargetLang(ES, speaker)]
)
```
Benchmarking
The SDK includes a powerful benchmarking module for performance analysis and quality testing. Run comprehensive benchmarks with detailed metrics, latency measurements, and trace data export.
```shell
uv run python -m palabra_ai.benchmark examples/speech/en.mp3 en es --out ./results

# With Docker
make bench -- examples/speech/en.mp3 en es --out ./results
```
See Benchmarking Guide for complete documentation including configuration options, output files, and advanced usage.
Features
Real-time translation
Translate audio streams in real time with minimal latency. Perfect for live conversations, conferences, and meetings.
Voice cloning
Preserve the original speaker's voice characteristics in translations. Enable voice cloning in the configuration.
Device management
Easy device selection with interactive prompts or programmatic access:
```python
from palabra_ai import DeviceManager

dm = DeviceManager()

# Interactive selection
mic, speaker = dm.select_devices_interactive()

# Get devices by name
mic = dm.get_mic_by_name("Blue Yeti")
speaker = dm.get_speaker_by_name("MacBook Pro Speakers")

# List all devices
input_devices = dm.get_input_devices()
output_devices = dm.get_output_devices()
```
Audio Configuration
Sample Rates by Protocol
The SDK automatically handles audio sample rates based on the connection protocol:
WebSocket (WS) Mode
- Input (to API): Always 16kHz mono PCM
- Output (from API): Always 24kHz mono PCM
WebRTC Mode
- Input (to API): 48kHz mono PCM
- Output (from API): 48kHz mono PCM
The SDK automatically resamples audio to match these requirements regardless of your input/output device capabilities.
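As an illustration of what that resampling implies (this is not the SDK's internal implementation, just a minimal linear-interpolation sketch), a 48kHz device buffer maps onto the 16kHz WebSocket input rate like this:

```python
def resample_linear(samples: list[float], src_rate: int, dst_rate: int) -> list[float]:
    """Resample a mono signal via linear interpolation (illustrative only)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    dst_len = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(dst_len):
        # Position of this output sample on the source time axis
        pos = i * src_rate / dst_rate
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out

# One second of 48kHz device audio becomes 16k samples at the WS input rate
one_second_48k = [0.0] * 48000
downsampled = resample_linear(one_second_48k, 48000, 16000)
print(len(downsampled))  # 16000
```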
Supported languages
Speech recognition languages (Source)
Arabic (AR), Bashkir (BA), Belarusian (BE), Bulgarian (BG), Bengali (BN), Catalan (CA), Czech (CS), Welsh (CY), Danish (DA), German (DE), Greek (EL), English (EN), Esperanto (EO), Spanish (ES), Estonian (ET), Basque (EU), Persian (FA), Finnish (FI), French (FR), Irish (GA), Galician (GL), Hebrew (HE), Hindi (HI), Croatian (HR), Hungarian (HU), Interlingua (IA), Indonesian (ID), Italian (IT), Japanese (JA), Korean (KO), Lithuanian (LT), Latvian (LV), Mongolian (MN), Marathi (MR), Malay (MS), Maltese (MT), Dutch (NL), Norwegian (NO), Polish (PL), Portuguese (PT), Romanian (RO), Russian (RU), Slovak (SK), Slovenian (SL), Swedish (SV), Swahili (SW), Tamil (TA), Thai (TH), Turkish (TR), Uyghur (UG), Ukrainian (UK), Urdu (UR), Vietnamese (VI), Chinese (ZH)
Translation languages (Target)
Arabic (AR), Azerbaijani (AZ), Belarusian (BE), Bulgarian (BG), Bosnian (BS), Catalan (CA), Czech (CS), Welsh (CY), Danish (DA), German (DE), Greek (EL), English (EN), English Australian (EN_AU), English Canadian (EN_CA), English UK (EN_GB), English US (EN_US), Spanish (ES), Spanish Mexican (ES_MX), Estonian (ET), Finnish (FI), Filipino (FIL), French (FR), French Canadian (FR_CA), Galician (GL), Hebrew (HE), Hindi (HI), Croatian (HR), Hungarian (HU), Indonesian (ID), Icelandic (IS), Italian (IT), Japanese (JA), Kazakh (KK), Korean (KO), Lithuanian (LT), Latvian (LV), Macedonian (MK), Malay (MS), Dutch (NL), Norwegian (NO), Polish (PL), Portuguese (PT), Portuguese Brazilian (PT_BR), Romanian (RO), Russian (RU), Slovak (SK), Slovenian (SL), Serbian (SR), Swedish (SV), Swahili (SW), Tamil (TA), Turkish (TR), Ukrainian (UK), Urdu (UR), Vietnamese (VI), Chinese (ZH), Chinese Simplified (ZH_HANS), Chinese Traditional (ZH_HANT)
Available language constants
# English variants - 1.5+ billion speakers (including L2)
EN, EN_AU, EN_CA, EN_GB, EN_US,
# Chinese variants - 1.3+ billion speakers
ZH, ZH_HANS, ZH_HANT, # ZH_HANS and ZH_HANT for translation only
# Hindi & Indian languages - 800+ million speakers
HI, BN, MR, TA, UR,
# Spanish variants - 500+ million speakers
ES, ES_MX,
# Arabic variants - 400+ million speakers
AR, AR_AE, AR_SA,
# French variants - 280+ million speakers
FR, FR_CA,
# Portuguese variants - 260+ million speakers
PT, PT_BR,
# Russian & Slavic languages - 350+ million speakers
RU, UK, PL, CS, SK, BG, HR, SR, SL, MK, BE,
# Japanese & Korean - 200+ million speakers combined
JA, KO,
# Southeast Asian languages - 400+ million speakers
ID, VI, MS, FIL, TH,
# Germanic languages - 150+ million speakers
DE, NL, SV, NO, DA, IS,
# Romance languages (other) - 100+ million speakers
IT, RO, CA, GL,
# Turkic & Central Asian languages - 200+ million speakers
TR, AZ, KK, UG,
# Baltic languages - 10+ million speakers
LT, LV, ET,
# Other European languages - 50+ million speakers
EL, HU, FI, EU, CY, MT,
# Middle Eastern languages - 50+ million speakers
HE, FA,
# African languages - 100+ million speakers
SW,
# Asian languages (other) - 50+ million speakers
MN, BA,
# Constructed languages
EO, IA,
# Other languages
GA, BS
)
Note: Source languages (for speech recognition) and target languages (for translation) have different support. The SDK automatically validates language compatibility when creating SourceLang and TargetLang objects.
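The validation itself happens inside the SDK, but the idea can be sketched with plain sets built from the tables above (the function name `check_lang` and these set literals are illustrative, not part of the SDK):

```python
# Small subsets of the supported-language tables above, for illustration
SOURCE_LANGS = {"AR", "BA", "EN", "ES", "UG", "ZH"}            # speech recognition
TARGET_LANGS = {"AR", "EN", "ES", "ZH", "ZH_HANS", "ZH_HANT"}  # translation

def check_lang(code: str, role: str) -> bool:
    """Return True if `code` is usable in the given role ('source' or 'target')."""
    table = SOURCE_LANGS if role == "source" else TARGET_LANGS
    return code in table

print(check_lang("ZH_HANS", "target"))  # True: simplified Chinese is a valid target
print(check_lang("ZH_HANS", "source"))  # False: recognition supports only plain ZH
print(check_lang("BA", "source"))       # True: Bashkir is recognition-only
```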
Development status
Current status
- Core SDK functionality
- GitHub Actions CI/CD
- Docker packaging
- Python 3.11, 3.12, 3.13 support
- PyPI publication
- Documentation site (coming soon)
- Code coverage reporting (setup required)
Current dev roadmap
- TODO: global timeout support for long-running tasks
- TODO: support for multiple source languages in a single run
- TODO: fine cancelling on cancel_all_tasks()
- TODO: error handling improvements
Build status
- Tests: Running on Python 3.11, 3.12, 3.13
- Release: Automated releases with Docker images
- Coverage: Tests implemented, reporting setup needed
Requirements
- Python 3.11+
- Palabra AI API credentials (get them at palabra.ai)
Support
- Documentation: https://docs.palabra.ai
- Issues: GitHub Issues
- Email: info@palabra.ai
License
This project is licensed under the MIT License - see the LICENSE file for details.
(c) Palabra.ai, 2025 | Breaking down language barriers with AI