Financial Expressions
Currency symbols, codes, and abbreviations (M, K, $)
The startup secured $5.2M in venture capital,
a huge leap from their initial $450K seed round.
Lightning Fast, On-Device TTS.
Incredibly lightweight and blazingly fast, running natively in your environment via ONNX.
Supertonic can process over 12,000 characters on high-end GPUs,
and up to about 2,500 characters on consumer laptops.
Time and date formats, abbreviated weekdays/months
The train delay was announced at 4:45 PM on Wed, Apr 3, 2024 due to track maintenance.
Area codes, hyphens, extensions (ext.)
You can reach the hotel front desk at (212) 555-0142 ext. 402 anytime.
Numbers with units, abbreviated technical notations
Our drone battery lasts 2.3h when flying at 30kph with full camera payload.
Optimized ONNX Runtime inference delivers speech synthesis at unprecedented speeds. No more waiting.
Minimal footprint means it runs smoothly on any device - from servers to embedded systems.
Complete privacy and no network latency. All processing happens locally, with no cloud dependencies.
Seamlessly processes numbers, dates, currency, abbreviations, and complex expressions without pre-processing.
Adjust inference steps, batch processing, and other parameters to match your specific needs.
Deploy seamlessly across servers, browsers, and edge devices with multiple runtime backends.
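To make the "without pre-processing" claim concrete, the sketch below shows the kind of rule-based text normalization a conventional TTS front end would typically need for the example sentences on this page. It is not part of Supertonic; the function names `expand_currency` and `expand_units` and the small rule tables are hypothetical, and real normalizers cover far more cases.

```python
import re

# Hypothetical rule tables for illustration only.
MAGNITUDES = {"K": "thousand", "M": "million", "B": "billion"}
UNITS = {"h": "hours", "kph": "kilometers per hour"}


def expand_currency(text):
    """Rewrite dollar amounts such as "$5.2M" as "5.2 million dollars"."""
    def repl(match):
        amount, suffix = match.group(1), match.group(2)
        if suffix:
            return f"{amount} {MAGNITUDES[suffix]} dollars"
        return f"{amount} dollars"

    # "$", a number with optional decimals, then an optional K/M/B suffix.
    return re.sub(r"\$(\d+(?:\.\d+)?)([KMB])?\b", repl, text)


def expand_units(text):
    """Rewrite compact units such as "2.3h" or "30kph" as spoken phrases."""
    pattern = r"\b(\d+(?:\.\d+)?)(kph|h)\b"
    return re.sub(pattern, lambda m: f"{m.group(1)} {UNITS[m.group(2)]}", text)


if __name__ == "__main__":
    print(expand_currency("The startup secured $5.2M in venture capital,"))
    print(expand_units("Our drone battery lasts 2.3h when flying at 30kph."))
```

Every new notation (dates, phone extensions, other currencies) would need another hand-written rule in a pipeline like this, which is the maintenance burden the feature above is described as avoiding.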
Presents SupertonicTTS, a highly efficient TTS framework built on flow-matching and ConvNeXt blocks. Features context-sharing batch expansion, character-level processing, and cross-attention alignment without the need for external G2P modules or aligners.
Introduces LARoPE, an improved position embedding method that enables faster convergence, more accurate alignment, and better stability in extended speech generation up to 30 seconds. Achieves state-of-the-art word error rate on zero-shot TTS benchmarks.
Proposes Self-Purifying Flow Matching (SPFM), a principled approach to handling noisy training data. Identifies unreliable samples during training without pretrained models, ensuring accurate conditioning even with label contamination.
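For readers unfamiliar with flow matching, here is a minimal, generic sketch of how sampling with a fixed number of inference steps usually works: an ODE is integrated from a starting point toward the data with Euler steps. This is not the SupertonicTTS implementation. The `velocity` function is a toy stand-in for a learned network, and `euler_sample` and `num_steps` are hypothetical names chosen for illustration.

```python
def velocity(x, t, target):
    """Toy stand-in for a learned velocity field (an assumption, not the model).

    For straight-line paths x_t = (1 - t) * x0 + t * target, the velocity
    that carries the current point to the target is (target - x) / (1 - t).
    """
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]


def euler_sample(x0, target, num_steps):
    """Integrate dx/dt = velocity(x, t) from t = 0 to t = 1 with Euler steps.

    Returns the full trajectory, including the starting point.
    """
    x = list(x0)
    dt = 1.0 / num_steps
    trajectory = [list(x)]
    for k in range(num_steps):
        t = k * dt  # always strictly below 1, so the division is safe
        v = velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        trajectory.append(list(x))
    return trajectory


if __name__ == "__main__":
    # In a real system x0 would be random noise and the field a neural network.
    for point in euler_sample([0.0, 0.0], [1.0, -2.0], num_steps=4):
        print(point)
```

In this toy the path is a straight line, so any step count lands on the target. A learned field is generally only approximately straight, which is why the number of inference steps is typically exposed as a speed-versus-quality setting.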