Interactive Demo

Experience Kani TTS directly in your browser. Generate high-quality speech from any text input and hear the difference.

Try Kani TTS Now

Our interactive demo lets you test Kani TTS without any setup: enter any text, adjust the generation parameters, and generate speech in real time.

Custom text input with example prompts

Parameter adjustment (temperature, top-p, repetition penalty)

Real-time audio generation and playback

Multiple model variants to choose from

Demo Features

Model Selection

Choose among the base model (random voices), the female voice variant, and the male voice variant

Parameter Control

Adjust temperature (0.1-1.5), top-p (0.1-1), repetition penalty (1-2), and max tokens (100-2000)
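
The ranges above can be checked before a request is sent. The following is a minimal sketch, not the demo's actual code: the GenerationParams class, the clamp_params helper, the field names, and the default values are all hypothetical. Only the ranges are taken from the list above.

```python
from dataclasses import dataclass

# Inclusive ranges as listed for the demo's parameter controls.
PARAM_RANGES = {
    "temperature": (0.1, 1.5),
    "top_p": (0.1, 1.0),
    "repetition_penalty": (1.0, 2.0),
    "max_tokens": (100, 2000),
}


@dataclass
class GenerationParams:
    """Hypothetical settings container; the defaults are illustrative only."""
    temperature: float = 1.0
    top_p: float = 0.9
    repetition_penalty: float = 1.1
    max_tokens: int = 1200

    def __post_init__(self) -> None:
        # Reject any value that falls outside its documented range.
        for name, (low, high) in PARAM_RANGES.items():
            value = getattr(self, name)
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside the range {low}-{high}")


def clamp_params(**overrides: float) -> GenerationParams:
    """Clamp out-of-range values to the nearest bound instead of raising."""
    clamped = {}
    for name, value in overrides.items():
        low, high = PARAM_RANGES[name]
        clamped[name] = min(max(value, low), high)
    return GenerationParams(**clamped)
```

Raising on invalid input suits programmatic callers, while clamping mirrors how a slider in a web interface would typically behave; which one fits depends on where the values come from.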

Audio Quality

Generate 22kHz high-quality audio with natural intonation and expression

Real-Time Processing

Fast generation with ~1 second processing time for 15-second audio
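
The speed figure above is often expressed as a real-time factor (RTF): processing time divided by the duration of the generated audio. The short sketch below works through that arithmetic using only the numbers quoted above; the function name is illustrative.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration (lower is faster)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# Figures quoted above: about 1 second of processing for 15 seconds of audio.
rtf = real_time_factor(1.0, 15.0)
speedup = 1.0 / rtf
print(f"RTF ~ {rtf:.3f}, about {speedup:.0f}x faster than real time")
# prints: RTF ~ 0.067, about 15x faster than real time
```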

Live Demo Interface

The demo below is hosted on Hugging Face Spaces and opens directly in your browser. No installation is required.

KaniTTS: Fast and Expressive Speech Generation Model

Example Prompts to Try

Casual Conversation

"Hello world! My name is Kani, I'm a speech generation model!"

Perfect for testing basic speech generation and natural intonation.

Emotional Expression

"I do believe Marsellus Wallace, MY husband, YOUR boss, told you to take me out and do WHATEVER I WANTED."

Test emotional range and emphasis with dramatic text.

Question and Answer

"What do we say to the god of death? Not today!"

Evaluate question intonation and dramatic delivery.

Humor and Wit

"What do you call a lawyer with an IQ of 60? Your Honor."

Test timing and comedic delivery in speech generation.

Complex Dialogue

"You mean, let me understand this cause, you know maybe it's me, it's a little messed up maybe, but I'm funny how, I mean funny like I'm a clown, I amuse you?"

Challenge the model with complex, conversational text.

Technical Content

"Kani TTS is a modular Human-Like TTS Model that generates high-quality speech from text input with 450M parameters."

Test pronunciation of technical terms and proper nouns.

Demo Performance Information

What to Expect

Processing Time

Most text inputs will generate speech within 1-3 seconds, depending on length and complexity. The demo runs on Hugging Face's infrastructure for optimal performance.

Audio Quality

Generated audio maintains a 22kHz sample rate with natural intonation, appropriate pauses, and emotional expression that matches the context of the input text.

Model Variants

Test different voice characteristics by switching between base, female, and male voice models. Each variant offers unique tonal qualities and speaking styles.

Technical Details

Model Architecture: LFM2-350M + NanoCodec

Parameters: 450M

Sample Rate: 22kHz

Compression: 0.6kbps

Max Tokens: 2000

Languages: 8 Supported
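
To put the 0.6kbps compression figure in perspective, the sketch below compares it with uncompressed PCM audio. It rests on assumptions that this page does not state: that "22kHz" means the common 22,050 Hz rate, that the uncompressed reference is 16-bit mono PCM, and that "kbps" means 1,000 bits per second.

```python
SAMPLE_RATE_HZ = 22_050    # assumption: "22kHz" refers to 22.05 kHz
BITS_PER_SAMPLE = 16       # assumption: 16-bit PCM reference
CHANNELS = 1               # assumption: mono
CODEC_BITRATE_BPS = 600    # 0.6kbps, from the Technical Details above

# Bitrate of the uncompressed reference signal, in bits per second.
pcm_bitrate_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS
compression_ratio = pcm_bitrate_bps / CODEC_BITRATE_BPS


def encoded_size_bytes(duration_seconds: float, bitrate_bps: int = CODEC_BITRATE_BPS) -> float:
    """Size in bytes of a stream with the given duration and bitrate."""
    return duration_seconds * bitrate_bps / 8


print(f"PCM bitrate: {pcm_bitrate_bps} bps")              # 352800 bps
print(f"Compression ratio: {compression_ratio:.0f}:1")    # 588:1
print(f"15 s of codec tokens: {encoded_size_bytes(15):.0f} bytes")                 # 1125 bytes
print(f"15 s of raw PCM: {encoded_size_bytes(15, pcm_bitrate_bps):.0f} bytes")     # 661500 bytes
```

Under these assumptions, a 15-second clip is represented by roughly 1 KB of codec data before it is decoded back into a waveform.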

Ready to Integrate Kani TTS?

After trying the demo, explore our installation guide to set up Kani TTS on your own system for production use.
