Open Source · 1.6B Parameters

Redefine Voice Interaction with Nari Labs Dia-1.6B

The world's first open-source TTS model supporting non-linguistic expressions. Clone any voice with just 5 seconds of audio and achieve real-time performance at 40 tokens/sec on consumer GPUs.

Real-time Generation · Voice Cloning · Open Source

Nari Labs: Democratizing Voice AI with Dia-1.6B

Where Open-Source Meets Human-Like Speech

Founded by two South Korean undergraduates, Nari Labs is redefining text-to-speech (TTS) technology with Dia-1.6B, a 1.6-billion-parameter open-source model that rivals proprietary giants like ElevenLabs and Google's NotebookLM. Built with zero external funding, the project leverages Google's TPU Research Cloud and Hugging Face's ZeroGPU grants to deliver studio-grade voice synthesis for developers, creators, and researchers.

Expressive Multi-Speaker Dialogues

Generate lifelike conversations with [S1]/[S2] tags, dynamic emotion adaptation, and non-verbal cues like (laughs) or (coughs).
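
As a rough illustration, the sketch below shows how such a tagged script could be passed to the model. It assumes the `dia` Python package from the nari-labs/dia repository and the `nari-labs/Dia-1.6B` checkpoint on Hugging Face; the exact `generate` signature and output sample rate may differ between releases.

```python
# Minimal dialogue-generation sketch (assumes the `dia` package and the
# nari-labs/Dia-1.6B checkpoint; API details may vary by release).
import soundfile as sf
from dia.model import Dia

model = Dia.from_pretrained("nari-labs/Dia-1.6B")

# [S1]/[S2] mark alternating speakers; parenthetical cues such as (laughs)
# are rendered as non-verbal sounds rather than read aloud.
script = (
    "[S1] Have you tried the new Dia model yet? "
    "[S2] I have, and the dialogue sounds surprisingly natural. (laughs) "
    "[S1] Wait until you hear the voice cloning."
)

audio = model.generate(script)          # assumed to return a mono waveform
sf.write("dialogue.wav", audio, 44100)  # 44.1 kHz output sample rate assumed
```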

Zero-Shot Voice Cloning

Replicate any voice from a 5-second audio clip—no fine-tuning required—ideal for personalized audiobooks or gaming NPCs.
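
The cloning flow can be sketched along the same lines. The `audio_prompt` argument name and the convention of prepending the reference clip's transcript are assumptions modeled on the project's example scripts, so treat this as illustrative rather than canonical.

```python
# Voice-cloning sketch; the `audio_prompt` argument is an assumption, so
# check the nari-labs/dia examples for the exact cloning interface.
import soundfile as sf
from dia.model import Dia

model = Dia.from_pretrained("nari-labs/Dia-1.6B")

# Transcript of the ~5-second reference clip, followed by the new lines to
# generate in that voice.
reference_transcript = "[S1] This is a short sample of my voice."
new_lines = " [S1] And this is the same voice reading a brand-new script."

audio = model.generate(
    reference_transcript + new_lines,
    audio_prompt="reference_5s.wav",  # hypothetical path to the reference clip
)
sf.write("cloned.wav", audio, 44100)
```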

Real-Time Efficiency

Runs on consumer GPUs at 40 tokens/sec, with future plans for CPU support and 8-bit quantization to slash VRAM usage by 60%.
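
To see why 8-bit weights help, a back-of-the-envelope memory estimate for a 1.6B-parameter model is enough; the figures below cover weights only and ignore activations, caches, and framework overhead, so real VRAM usage will be higher.

```python
# Rough weight-memory estimate at different precisions (weights only).
PARAMS = 1.6e9  # 1.6 billion parameters

for precision, bytes_per_param in [("fp32", 4), ("fp16", 2), ("int8", 1)]:
    gigabytes = PARAMS * bytes_per_param / 1e9
    print(f"{precision}: ~{gigabytes:.1f} GB of weights")

# fp32: ~6.4 GB, fp16: ~3.2 GB, int8: ~1.6 GB -- halving weight precision is
# where most of the projected VRAM savings would come from.
```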

Open-Source Edge

Apache 2.0 licensed, with built-in audio watermarking and strict anti-impersonation policies for responsible use.

Advanced Features of Nari Labs

Experience the next generation of voice synthesis technology

Ultra-Expressive Dialogue

Advanced contextual understanding for natural expressions including laughs, breaths, and emotional nuances.

Human-like Expression

Hardware Optimized

Runs on consumer GPUs with just 10 GB of VRAM, reaching 40 tokens/sec on an NVIDIA RTX A4000.

Performance Focused
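
An easy way to sanity-check throughput on your own hardware is to time a generation and compare the audio produced against wall-clock time. The sketch below reports a real-time factor rather than raw token counts; the model call and 44.1 kHz sample rate are the same assumptions as in the earlier examples.

```python
# Timing sketch: seconds of audio produced per second of wall-clock time.
import time

import soundfile as sf
from dia.model import Dia

SAMPLE_RATE = 44100  # assumed output sample rate
model = Dia.from_pretrained("nari-labs/Dia-1.6B")

script = "[S1] Benchmarking Dia on a single consumer GPU. [S2] Let's see how fast it runs."

start = time.perf_counter()
audio = model.generate(script)
elapsed = time.perf_counter() - start

audio_seconds = len(audio) / SAMPLE_RATE
print(f"Generated {audio_seconds:.1f}s of audio in {elapsed:.1f}s "
      f"(real-time factor {audio_seconds / elapsed:.2f}x)")
sf.write("benchmark.wav", audio, SAMPLE_RATE)
```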

Maximum Efficiency

A planned quantization pipeline targets a 60% VRAM reduction, paired with energy-aware inference aiming for a 1.2× efficiency improvement.

Resource Efficient

Open Source DNA

Apache 2.0 licensed for unrestricted commercial use. Modular architecture via Hugging Face integration.

Community Driven
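
Because the weights live on the Hugging Face Hub, pulling them into a custom pipeline takes a single call. The snippet below uses the standard `huggingface_hub` client; the `nari-labs/Dia-1.6B` repository id is taken from the project's published checkpoint.

```python
# Download the model checkpoint locally for offline use or custom loading.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="nari-labs/Dia-1.6B")
print(f"Checkpoint files downloaded to: {local_dir}")
```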

How It Works

Experience Nari Labs' studio-quality voice synthesis in three simple steps

1. Input Your Text

Enter your script with optional [S1]/[S2] speaker markers and non-verbal cues such as (laughs) or (coughs) to control delivery.
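
If you assemble scripts programmatically, say from a chat log or a game's dialogue tree, a small helper keeps the tagging consistent. The helper below is purely illustrative and follows the [S1]/[S2] and (laughs)-style conventions described on this page; it is not part of the dia package.

```python
# Illustrative helper: turn structured dialogue turns into a tagged script.
def build_script(turns):
    """turns: iterable of (speaker_id, line, optional_cue) tuples."""
    parts = []
    for speaker, line, cue in turns:
        cue_text = f" ({cue})" if cue else ""
        parts.append(f"[S{speaker}] {line}{cue_text}")
    return " ".join(parts)

script = build_script([
    (1, "Did you finish the narration for chapter three?", None),
    (2, "Finished it this morning.", "laughs"),
    (1, "Great, send it over and I'll line it up with the music.", None),
])
print(script)
# [S1] Did you finish ... [S2] Finished it this morning. (laughs) [S1] Great, ...
```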

2. Choose Your Voice

Select from our voice library or upload a 5-second sample for instant voice cloning. Perfect for character dialogues or personalized narration.
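
Before uploading a reference clip, it is worth confirming that it is roughly five seconds long; the check below uses `soundfile`, and the file name and duration bounds are only illustrative.

```python
# Quick sanity check on a voice-cloning reference clip.
import soundfile as sf

data, sample_rate = sf.read("reference_5s.wav")  # hypothetical file name
duration = len(data) / sample_rate
channels = 1 if data.ndim == 1 else data.shape[1]

print(f"{duration:.1f}s, {channels} channel(s), {sample_rate} Hz")
if not 4.0 <= duration <= 10.0:
    print("Warning: aim for a clip of roughly 5 seconds for best results.")
```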

3. Generate & Export

Watch as Dia-1.6B generates human-like speech in real time. Export in multiple formats and use our API for seamless integration.
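
Exporting amounts to writing the returned waveform to disk. `soundfile` covers WAV, FLAC, and OGG out of the box (MP3 typically needs a separate encoder); the silent stand-in array below takes the place of a real `model.generate` result from the earlier sketches.

```python
# Write one generated waveform to several container formats.
import numpy as np
import soundfile as sf

SAMPLE_RATE = 44100  # assumed output sample rate

# Stand-in waveform; in practice this is the array returned by model.generate().
audio = np.zeros(SAMPLE_RATE, dtype="float32")

for path in ("narration.wav", "narration.flac", "narration.ogg"):
    sf.write(path, audio, SAMPLE_RATE)  # format inferred from the extension
```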


Frequently Asked Questions

Learn more about Nari Labs Dia-1.6B voice synthesis technology

Community Feedback

See how Nari Labs Dia-1.6B is transforming voice synthesis worldwide

David Chen

AI Developer

"NariLabs' Dia-1.6B has revolutionized our game development pipeline. The real-time voice synthesis and emotion control are incredible. We've integrated it into our NPCs, and the results are beyond expectations."

Sarah Kim

Content Creator

"As a YouTuber, NariLabs Dia-1.6B has been a game-changer. The voice cloning feature is incredibly accurate, and the emotional expressions make my content more engaging. Plus, it's all open-source!"

Dr. Michael Park

AI Researcher

"NariLabs' commitment to open source is commendable. Dia-1.6B's architecture and performance are impressive. The model's ability to handle non-verbal expressions sets a new standard in TTS research."