The world's first open-source TTS model supporting non-linguistic expressions. Clone any voice with just 5 seconds of audio and achieve real-time performance at 40 tokens/sec on consumer GPUs.
Where Open-Source Meets Human-Like Speech
Founded by two South Korean undergraduates, Nari Labs is redefining text-to-speech (TTS) technology through Dia-1.6B, a 1.6-billion-parameter open-source model that rivals proprietary systems such as ElevenLabs and Google's NotebookLM. Built with zero external funding, the project relies on Google's TPU Research Cloud and Hugging Face's ZeroGPU grants to deliver studio-grade voice synthesis for developers, creators, and researchers.
Generate lifelike conversations with [S1]/[S2] tags, dynamic emotion adaptation, and non-verbal cues like (laughs) or (coughs).
Replicate any voice from a 5-second audio clip, with no fine-tuning required. Ideal for personalized audiobooks or gaming NPCs.
Runs on consumer GPUs at 40 tokens/sec, with CPU support and 8-bit quantization planned to cut VRAM usage by 60%.
Apache 2.0 licensed, with built-in audio watermarking and strict anti-impersonation policies for responsible use.
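To make the dialogue format from the feature list concrete, here is a minimal sketch that assembles a tagged script from speaker turns. It uses only the [S1]/[S2] speaker tags and the (laughs) and (coughs) cues named above. The helper name `build_script` is hypothetical and not part of any Nari Labs API, and the assumption that turns are joined with a single space is ours.

```python
from typing import Iterable, Tuple


def build_script(turns: Iterable[Tuple[int, str]]) -> str:
    """Join (speaker_number, line) pairs into one tagged script string.

    Hypothetical helper: it only formats text and does not call any model.
    """
    parts = []
    for speaker, line in turns:
        if speaker < 1:
            raise ValueError("speaker numbers start at 1")
        # Prefix each turn with its speaker tag, e.g. "[S1] Hello".
        parts.append(f"[S{speaker}] {line.strip()}")
    return " ".join(parts)


script = build_script([
    (1, "Did you hear the demo?"),
    (2, "I did. (laughs) It even coughs on cue."),
    (1, "(coughs) Like that?"),
])
print(script)
```

The resulting string is what you would paste in as the input text; non-verbal cues stay inline inside the turn where they should occur.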
Experience the next generation of voice synthesis technology
Advanced contextual understanding for natural expressions including laughs, breaths, and emotional nuances.
Runs on consumer GPUs with just 10 GB of VRAM. Achieve 40 tokens/sec on an NVIDIA RTX A4000.
Advanced quantization pipeline targeting 60% VRAM reduction. Energy-aware inference with 1.2x improvement.
Apache 2.0 licensed, which permits commercial use. Modular architecture via Hugging Face integration.
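The hardware figures above can be turned into quick back-of-the-envelope estimates. The sketch below assumes the 60% reduction target applies to the full 10 GB footprint, and it only estimates compute time for a given token count, since this page does not state how many tokens correspond to one second of audio. Both function names are illustrative.

```python
def projected_vram_gb(current_gb: float, reduction_percent: float) -> float:
    """VRAM left after a percentage reduction (assumes it applies to the whole footprint)."""
    return current_gb * (100 - reduction_percent) / 100


def generation_seconds(num_tokens: int, tokens_per_sec: float = 40.0) -> float:
    """Wall-clock time to generate num_tokens at the quoted throughput."""
    return num_tokens / tokens_per_sec


print(projected_vram_gb(10, 60))  # 4.0
print(generation_seconds(1200))   # 30.0
```

Under these assumptions, the quantization target would bring the requirement from 10 GB down to about 4 GB.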
Experience Nari Labs' studio-quality voice synthesis in three simple steps
Enter your text with optional emotion tags and speaker markers. Support for multiple languages and expressive elements like [laugh], [cry], or [whisper].
Select from our voice library or upload a 5-second sample for instant voice cloning. Perfect for character dialogues or personalized narration.
Watch as Dia-1.6B generates human-like speech in real time. Export in multiple formats and use our API for seamless integration.
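Before uploading a sample for voice cloning, it can help to confirm the clip is long enough. The following is a minimal sketch using Python's standard `wave` module. It assumes an uncompressed WAV file and treats 5 seconds as a minimum, which is our reading of the "5-second sample" wording rather than a documented rule. The helper names are hypothetical.

```python
import io
import wave

MIN_SAMPLE_SECONDS = 5.0  # assumption based on the "5-second sample" wording


def wav_duration_seconds(source) -> float:
    """Return the duration of a WAV file given a path or a binary file object."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_long_enough(source, minimum: float = MIN_SAMPLE_SECONDS) -> bool:
    """True if the clip lasts at least `minimum` seconds."""
    return wav_duration_seconds(source) >= minimum


def make_silent_wav(seconds: float, sample_rate: int = 16000) -> io.BytesIO:
    """Build an in-memory mono 16-bit silent WAV, used here only for demonstration."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    buf.seek(0)  # rewind so the buffer can be read back
    return buf


print(is_long_enough(make_silent_wav(6)))  # True
print(is_long_enough(make_silent_wav(3)))  # False
```

In practice you would pass the path of your own recording to `is_long_enough` instead of the generated silent clip.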
Learn more about Nari Labs Dia-1.6B voice synthesis technology
See how Nari Labs Dia-1.6B is transforming voice synthesis worldwide
AI Developer
"NariLabs' Dia-1.6B has revolutionized our game development pipeline. The real-time voice synthesis and emotion control are incredible. We've integrated it into our NPCs, and the results are beyond expectations."
Content Creator
"As a YouTuber, NariLabs Dia-1.6B has been a game-changer. The voice cloning feature is incredibly accurate, and the emotional expressions make my content more engaging. Plus, it's all open-source!"
AI Researcher
"NariLabs' commitment to open source is commendable. Dia-1.6B's architecture and performance are impressive. The model's ability to handle non-verbal expressions sets a new standard in TTS research."