On-Device Voice AI Platform & Neural Codec Communications
Seed Round: $2.5M at $12M pre-money
Inevara Pty Ltd · March 2026 · kane@sajdak.one
78% of consumers refuse cloud AI features. BIPA has generated $163M+ in settlements. COPPA 2025 mandates strict children's voice data rules.
Amazon AVS deprecated. Google Assistant sunsetting. Ford removing Alexa from vehicles. Cerence revenue down 24%. Microsoft Cortana discontinued.
Free open-source tools require deep ML expertise. Enterprise vendors start at $6,000+/year. Startups with budgets of $50-500/month have nowhere to go.
800M+ devices need a new voice platform. The window is open now.
On-device voice AI SDK. 10 engines, 65 models, 6 platform bindings. Runs on $0.99 MCUs to cloud servers. Privacy by default.
Secure communications platform. Encrypted text, voice at 1-3 kbps via neural codecs, video, channels, bots. Signal's privacy, Discord's features.
Both divisions share the same Rust core, ONNX inference engine, audio pipeline, and model format. One engineering investment, two markets.
All implemented in Rust. 106 tests passing. 65 ONNX models trained. This is not a roadmap.
Arabic, Hindi, and Urdu are first-class languages in every engine. Not an afterthought. Built into the foundation.
| Language | Speakers | On-Device Competitors | Saj Speak Status |
|---|---|---|---|
| English | 1.5B+ | Crowded | Fully implemented |
| Arabic | 420M | Zero | Wake words trained (MSA + Gulf), STT/TTS ready |
| Hindi | 600M | Zero | Wake words trained, STT/TTS pipeline ready |
| Urdu | 70M+ | Zero | Wake words trained, shares Arabic pipeline |
A hotel in Dubai needs Arabic for guests, Hindi/Urdu for staff, English for international visitors. Saj Speak is the only SDK that serves every person in a Gulf building.
Saudi Arabia investing $100B+ in AI (Project Transcendence). NEOM alone needs 50,000+ voice endpoints. MENA healthcare $14B→$90B by 2034.
| Principle | Implementation |
|---|---|
| Rust-first | Memory-safe, no GC, deterministic performance |
| ONNX standard | All models inspectable with standard tools |
| Dual inference | ort (native) + tract (WASM) |
| True offline | No license server. No phone-home. Faraday cage ready. |
| Encrypted models | .saj format (AES-256-GCM) |
| Platform | Technology | Status |
|---|---|---|
| Rust | Native | Production |
| Python | PyO3 | Production |
| Node.js | napi-rs | Production |
| Browser | wasm-bindgen | Production |
| C / iOS / macOS | cbindgen FFI | Production |
| Android | JNI | Planned |
| Dimension | Picovoice | Saj Speak |
|---|---|---|
| Revenue | ~$1.5M ARR | Pre-revenue |
| Team | 16 people | 1 (founder) |
| Engines | 9 | 10 + orchestrator + server |
| Voice Cloning | No | Yes (on-device, 3-10 sec) |
| Arabic / Hindi / Urdu | No / No / No | First-class |
| Annual pricing | $0 or $6,000+/yr | $0 / $348 / $2,388 / $11,988 per year / Enterprise |
| True offline | Requires AccessKey phone-home | Fully offline, no license server |
| REST API / Server | No | Yes (saj-server, OpenAI-compatible) |
| Neural Codec | No | Yes |
| Communications Platform | No | Yes (Saj Link) |
Instead of transmitting audio waveforms, transmit tiny neural tokens and reconstruct voice on the receiving end.
| Method | Bitrate | Quality |
|---|---|---|
| Opus (VoIP default) | 32 kbps | Excellent |
| Google Lyra V2 | 3.2 kbps | Good |
| Meta EnCodec | 1.5-6 kbps | Good-Excellent |
| Mimi (Kyutai) | 1.1 kbps | High |
| BigCodec | 1.04 kbps | Higher than ground truth |
BigCodec at 1.04 kbps scored higher than ground truth in blind listening tests (MUSHRA 92.33).
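To put the bitrate column in perspective, the short sketch below computes how many times smaller each codec's bitrate is than Opus at 32 kbps. The figures are copied from the table; using the lowest listed EnCodec rate (1.5 kbps) is an assumption made for illustration, and the function name is hypothetical.

```python
# Bitrates in kbps, copied from the codec table.
# Assumption: EnCodec is represented by its lowest listed rate (1.5 kbps).
BITRATES_KBPS = {
    "Opus": 32.0,
    "Lyra V2": 3.2,
    "EnCodec": 1.5,
    "Mimi": 1.1,
    "BigCodec": 1.04,
}


def reduction_vs_opus(codec: str) -> float:
    """Return how many times smaller the codec's bitrate is than Opus."""
    return BITRATES_KBPS["Opus"] / BITRATES_KBPS[codec]


if __name__ == "__main__":
    for name, kbps in BITRATES_KBPS.items():
        factor = reduction_vs_opus(name)
        print(f"{name:9s} {kbps:5.2f} kbps  {factor:5.1f}x smaller than Opus")
```

This compares codec payload bitrate only; packet headers and encryption overhead are not included.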
No B2B neural codec SDK exists. Zero.
Signal Protocol (X3DH + Double Ratchet) for DMs. MLS (RFC 9420) for groups. Threads, reactions, polls, disappearing messages. Offline-first with CRDT sync.
1-3 kbps voice calls. Persistent voice channels (Discord-style). SFrame E2E encrypted group calls. Spatial audio. Studio-quality at satellite bandwidth.
Live captioning (SajListen). Voice-first search (SajScribe). Meeting summarization. Speaker identification. Noise-cancelled calls (SajClean). Wake word activation.
Organizational units, open/private/voice channels, threading, discovery feed, channel linking.
E2E-aware bot platform. Webhooks, slash commands, marketplace. Developer API with SDKs.
Platforms: iOS, Android (native), Desktop (Tauri), Web (PWA + WASM), CLI
| Layer | Protocol |
|---|---|
| Key exchange (1:1) | X3DH (Signal Protocol) |
| Ratcheting | Double Ratchet |
| Group messaging | MLS (RFC 9420) |
| Post-quantum | ML-KEM (Kyber) + ML-DSA (Dilithium), hybrid |
| Voice / video | SRTP + DTLS-SRTP + per-packet AEAD |
| Group calls | SFrame (RFC 9605) |
| At-rest | AES-256-GCM / ChaCha20-Poly1305 |
| Auth | OPAQUE → Passkeys → FIDO2 |
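As an illustration of the ratcheting layer, here is a minimal sketch of the symmetric-key chain step described in the public Double Ratchet specification, which suggests HMAC-SHA256 with the constants 0x01 (message key) and 0x02 (next chain key). It is not Saj Link code: the function names are hypothetical, the all-zero starting key is a placeholder for a secret that would really come from X3DH and the root KDF, and the Diffie-Hellman ratchet is omitted.

```python
import hashlib
import hmac
from typing import List, Tuple


def kdf_chain_step(chain_key: bytes) -> Tuple[bytes, bytes]:
    """Advance the chain once: return (next_chain_key, message_key)."""
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key


def derive_message_keys(chain_key: bytes, count: int) -> List[bytes]:
    """Derive `count` successive message keys from a starting chain key."""
    keys = []
    for _ in range(count):
        chain_key, message_key = kdf_chain_step(chain_key)
        keys.append(message_key)
    return keys


if __name__ == "__main__":
    # Placeholder secret for demonstration only; never use a fixed key.
    shared_chain_key = bytes(32)
    sender_keys = derive_message_keys(shared_chain_key, 3)
    receiver_keys = derive_message_keys(shared_chain_key, 3)
    print("keys match:", sender_keys == receiver_keys)
```

Because each message key comes from a one-way function of the chain key, compromising a later chain key does not reveal earlier message keys.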
| Segment | TAM |
|---|---|
| Edge AI software (by 2030) | $8.91B |
| Smart speakers / IoT | $2.3B |
| Automotive voice AI | $1.8B |
| Healthcare voice | $1.1B |
| Segment | TAM |
|---|---|
| CPaaS/RTC platforms | $2-5B |
| Telco (emerging markets HD Voice) | $5-10B |
| Satellite / maritime | $500M-1B |
| Military / defense | $1-3B |
| Gaming / metaverse | $500M-1B |
The neural codec makes voice possible where it wasn't before: LoRa mesh, BLE hearing aids, satellite pagers, 2G/3G emerging markets. New markets, not just cheaper existing ones.
| Tier | Price | Target |
|---|---|---|
| Community | $0 | Evaluation |
| Indie | $29/mo | Solo devs |
| Pro | $199/mo | Startups |
| Business | $999/mo | Growth |
| Enterprise | $5K-40K/mo | Custom |
$2.00 (1-10K) → $0.75 (10-100K) → $0.30 (100K-1M) → $0.10-0.20 (1M+)
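The volume ladder above can be turned into a quick quote calculator. This is a sketch under stated assumptions: the prices are read as per-unit OEM prices, the whole order is priced at the tier its total volume falls into (flat, not graduated), and a volume exactly on a boundary takes the cheaper, higher-volume tier. None of these rules is stated in the deck, and the names are hypothetical.

```python
from typing import Tuple

# (minimum units, low price, high price) in USD per unit, highest tier first.
# Assumption: the 1M+ tier keeps its quoted $0.10-0.20 range.
TIERS = [
    (1_000_000, 0.10, 0.20),
    (100_000, 0.30, 0.30),
    (10_000, 0.75, 0.75),
    (1, 2.00, 2.00),
]


def unit_price(units: int) -> Tuple[float, float]:
    """Return the (low, high) per-unit price for an order of `units`."""
    if units < 1:
        raise ValueError("units must be at least 1")
    for min_units, low, high in TIERS:
        if units >= min_units:
            return low, high
    raise AssertionError("unreachable: the last tier starts at 1 unit")


def order_total(units: int) -> Tuple[float, float]:
    """Return the (low, high) total in USD under flat-tier pricing."""
    low, high = unit_price(units)
    return units * low, units * high


if __name__ == "__main__":
    for units in (5_000, 50_000, 500_000, 2_000_000):
        low, high = order_total(units)
        print(f"{units:>9,} units: ${low:,.0f} to ${high:,.0f}")
```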
| Tier | Price | Target |
|---|---|---|
| Free | $0 | Consumers |
| Pro | $8/user/mo | Teams |
| Business | $15/user/mo | Enterprise |
| Enterprise | Custom | Gov/Defense |
B2B licensing to CPaaS, gaming, telco. Plus: Console SaaS, Marketplace (20-25%), Certification ($5K-25K/yr)
Opus per voice minute
Neural codec per voice minute
Concurrent calls per 1 Gbps (Opus)
Concurrent calls per 1 Gbps (neural)
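The four metrics above follow directly from codec bitrate. The sketch below derives them from the deck's own rates (Opus at 32 kbps, neural codecs at 1-3 kbps), assuming one-directional audio payload only, with no RTP/UDP/IP headers or encryption overhead. The results are illustrative upper bounds, not measured figures, and the function names are hypothetical.

```python
LINK_BPS = 1_000_000_000  # 1 Gbps link


def kilobytes_per_minute(bitrate_kbps: float) -> float:
    """Payload size of one minute of one-way audio, in kilobytes (1 KB = 1000 B)."""
    bits_per_minute = bitrate_kbps * 1000 * 60
    return bits_per_minute / 8 / 1000


def concurrent_calls(bitrate_kbps: float, link_bps: int = LINK_BPS) -> int:
    """Number of one-way streams that fit on the link, ignoring packet overhead."""
    return int(link_bps // (bitrate_kbps * 1000))


if __name__ == "__main__":
    for label, kbps in (("Opus", 32.0), ("Neural, 3 kbps", 3.0), ("Neural, 1 kbps", 1.0)):
        print(
            f"{label:15s} {kilobytes_per_minute(kbps):6.1f} KB/min  "
            f"{concurrent_calls(kbps):>9,} streams per 1 Gbps"
        )
```

Real capacity would be lower once per-packet headers are counted, and the headers weigh proportionally more on the small neural-codec packets.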
| Metric | Opus (Traditional) | Saj Protocol |
|---|---|---|
| Server cost per 1M voice minutes | ~$150 | ~$15 |
| Cost per user/year (infra) | ~$15 | ~$2 |
| Packet size on wire | ~110 bytes (RTP) | 50 bytes |
The cost advantage enables a profitable free tier, which is what makes consumer adoption possible.
SDK division generates cash from Month 3. Break-even on seed alone at Month 12-14. Communications division adds 33% of Y3 revenue.
| Stream | Y1 | Y2 | Y3 |
|---|---|---|---|
| Subscriptions | $234K | $2.1M | $7.7M |
| OEM | $95K | $1.5M | $10.9M |
| Console/Marketplace | $54K | $700K | $3.7M |
| Certification | $10K | $130K | $460K |
| Total | $393K | $4.5M | $22.7M |
| Stream | Y1 | Y2 | Y3 |
|---|---|---|---|
| Consumer SaaS | — | — | $500K |
| Pro/Business | — | $200K | $3M |
| Enterprise/Gov | — | $500K | $5M |
| Codec SDK | — | $100K | $2M |
| API usage | — | $50K | $500K |
| Total | $0 | $850K | $11M |
By Year 3: SDK = 67% of revenue, Communications = 33%. OEM becomes the largest single stream as the data flywheel improves model quality.
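The Year 3 split can be checked against the two revenue tables. The sketch below sums the Y3 stream figures (in thousands of USD, copied from the tables) and computes each division's share. The SDK streams sum to about $22.76M, slightly above the rounded $22.7M total shown, which does not change the rounded percentages.

```python
# Year 3 revenue by stream, in thousands of USD, copied from the tables.
SDK_Y3 = {
    "Subscriptions": 7_700,
    "OEM": 10_900,
    "Console/Marketplace": 3_700,
    "Certification": 460,
}
COMMS_Y3 = {
    "Consumer SaaS": 500,
    "Pro/Business": 3_000,
    "Enterprise/Gov": 5_000,
    "Codec SDK": 2_000,
    "API usage": 500,
}

sdk_total = sum(SDK_Y3.values())
comms_total = sum(COMMS_Y3.values())
combined = sdk_total + comms_total

sdk_share = sdk_total / combined
comms_share = comms_total / combined

# Largest single stream across both divisions.
all_streams = {**SDK_Y3, **COMMS_Y3}
largest_stream = max(all_streams, key=all_streams.get)

if __name__ == "__main__":
    print(f"SDK: ${sdk_total / 1000:.2f}M ({sdk_share:.0%})")
    print(f"Communications: ${comms_total / 1000:.2f}M ({comms_share:.0%})")
    print(f"Largest stream: {largest_stream}")
```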
| Stage | Timing | Pre-Money | Raise | Trigger |
|---|---|---|---|---|
| Seed | Now | $12M | $2.5M SAFE | Platform built, 65 models |
| Series A | Month 18-24 | $80M | $12M | $4-5M ARR, 300%+ growth |
Month 36 Enterprise Value
Communications division roughly doubles the company valuation compared to SDK-only. Incremental engineering: 1 crate + platform = ~6 months on existing infrastructure.
| Month | Role | Why |
|---|---|---|
| 2 | DevRel Engineer | Community, docs, bus factor |
| 3 | ML Engineer | Model quality, Console |
| 4 | MENA BD Lead | Arabic monopoly monetization |
| 5 | Enterprise AE | US/EU pipeline |
| 6 | Backend Engineer | Console SaaS, billing |
| 8 | Frontend Engineer | Console UI, marketplace |
MENA BD Lead is the highest-ROI hire. The Arabic monopoly is worthless without someone in the region selling it.
Break-even on seed alone at Month 12-14. Series A is for growth acceleration, not survival. Founder retains 82.8% post-seed.
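The 82.8% figure follows from the seed terms, and the same arithmetic extends to the Series A row of the funding table. This is a simplified sketch: it assumes the founder holds 100% before the seed, treats the SAFE as converting exactly at the $12M pre-money, and ignores option pools and any other dilution. The function name is hypothetical.

```python
def retained_after_round(ownership: float, pre_money: float, raise_amount: float) -> float:
    """Ownership fraction left after a round that sells new shares only."""
    post_money = pre_money + raise_amount
    return ownership * pre_money / post_money


# Figures in millions of USD, from the funding table.
founder_after_seed = retained_after_round(1.0, pre_money=12.0, raise_amount=2.5)
founder_after_series_a = retained_after_round(
    founder_after_seed, pre_money=80.0, raise_amount=12.0
)

if __name__ == "__main__":
    print(f"Founder after seed:     {founder_after_seed:.1%}")
    print(f"Founder after Series A: {founder_after_series_a:.1%}")
```

Under these assumptions the founder would hold roughly 72% after a Series A on the stated terms; an employee option pool would reduce that further.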
SDK published to crates.io, PyPI, npm. Developer preview.
First paying customer. Console MVP.
Console launch. $10K MRR.
Home Assistant integration. saj-codec crate complete.
First MENA deal signed. $50K MRR.
$100K MRR ($1.2M ARR). Saj Link beta. Series A conversations.
Series A closed. First automotive pilot. Saj Link public beta.
5+ OEM production deploys. Saj Link v1.0 (messaging + voice).
$1M+ MRR. First government comms contract. Codec SDK licensing.
12+ OEM deploys. Saj Link enterprise launch. SOC 2.
$3M+ MRR. FIPS 140-3. Series B conversations.
The entire Saj Speak platform — 13 Rust crates, 106 tests, 10 engines, 4 binding targets — was built from first commit to production-ready in 2.5 days.
65 ONNX models trained across 4 languages, 5 generations
Lines of communications research produced (7 documents)
Platform integrations shipping (macOS, Windows, iOS, Android, Web)
Kane Sajdak
Founder & CEO, Inevara Pty Ltd
kane@sajdak.one · sajspeak.com