Saj Speak Pitch Deck
Confidential

On-Device Voice AI Platform & Neural Codec Communications

Seed Round: $2.5M at $12M pre-money

Inevara Pty Ltd · March 2026 · kane@sajdak.one

The Problem

Voice AI is broken

Cloud dependency is a liability

78% of consumers refuse cloud AI features. BIPA has generated $163M+ in settlements. COPPA 2025 mandates strict children's voice data rules.

The incumbents are retreating

Amazon AVS deprecated. Google Assistant sunsetting. Ford removing Alexa from vehicles. Cerence revenue down 24%. Microsoft Cortana discontinued.

The mid-market has no option

Free open-source tooling requires deep ML expertise. Enterprise vendors start at $6,000+/year. Startups with budgets of $50-500/month have nowhere to go.

800M+ devices need a new voice platform. The window is open now.

The Solution

Two businesses. One foundation.

Division 1

Platform (SDK & Server)

On-device voice AI SDK. 10 engines, 65 models, 6 platform bindings. Runs on $0.99 MCUs to cloud servers. Privacy by default.

  • Wake word, VAD, STT, TTS, Intent, Speaker ID, Diarization, Noise Suppression, Voice Cloning, Pipeline
  • Self-hosted server (saj-server) with REST + WebSocket APIs
  • Revenue: SDK subscriptions + OEM per-device licensing

Division 2

Communications (Saj Link)

Secure communications platform. Encrypted text, voice at 1-3 kbps via neural codecs, video, channels, bots. Signal's privacy, Discord's features.

  • Neural codec voice: 20-30x less bandwidth than Opus
  • E2E encrypted text messaging (Signal Protocol + MLS)
  • Revenue: SaaS seats + codec SDK licensing + enterprise contracts

Both divisions share the same Rust core, ONNX inference engine, audio pipeline, and model format. One engineering investment, two markets.

Platform Division

10 engines. Built, tested, shipping.

All implemented in Rust. 106 tests passing. 65 ONNX models trained. This is not a roadmap.

Engine | Function | Detail
SajWake | Wake Word Detection | 0.27 ms, 105 KB
SajDetect | Voice Activity Detection | Sub-10 ms
SajListen | Streaming STT | Zipformer + RNN-T
SajScribe | Batch STT | Whisper (split enc/dec)
SajSpeak | Text-to-Speech | Kokoro/Piper
SajIntent | Speech-to-Intent | Direct slot filling
SajVoice | Speaker Recognition | ECAPA-TDNN
SajWho | Diarization | Who spoke when
SajClean | Noise Suppression | RNNoise/DeepFilterNet
SajClone | Voice Cloning | 3-10 sec enrollment

65 ONNX Models · 17 Rust Crates · 6 Platform Bindings · 106 Tests Passing

Unfair Advantage #1

1.09 billion speakers. Zero competitors.

Arabic, Hindi, and Urdu are first-class languages in every engine. Not an afterthought. Built into the foundation.

Language | Speakers | On-Device Competitors | Saj Speak Status
English | 1.5B+ | Crowded | Fully implemented
Arabic | 420M | Zero | Wake words trained (MSA + Gulf), STT/TTS ready
Hindi | 600M | Zero | Wake words trained, STT/TTS pipeline ready
Urdu | 70M+ | Zero | Wake words trained, shares Arabic pipeline

A hotel in Dubai needs Arabic for guests, Hindi/Urdu for staff, English for international visitors. Saj Speak is the only SDK that serves every person in a Gulf building.

Saudi Arabia investing $100B+ in AI (Project Transcendence). NEOM alone needs 50,000+ voice endpoints. MENA healthcare $14B→$90B by 2034.

Technology

Rust core. ONNX standard. MCU to cloud.

Principle | Implementation
Rust-first | Memory-safe, no GC, deterministic performance
ONNX standard | All models inspectable with standard tools
Dual inference | ort (native) + tract (WASM)
True offline | No license server. No phone-home. Faraday cage ready.
Encrypted models | .saj format (AES-256-GCM)

Platform Bindings

Platform | Technology | Status
Rust | Native | Production
Python | PyO3 | Production
Node.js | napi-rs | Production
Browser | wasm-bindgen | Production
C / iOS / macOS | cbindgen FFI | Production
Android | JNI | Planned

Competitive Landscape

Picovoice is the only competitor. We beat them.

Dimension | Picovoice | Saj Speak
Revenue | ~$1.5M ARR | Pre-revenue
Team | 16 people | 1 (founder)
Engines | 9 | 10 + orchestrator + server
Voice Cloning | No | Yes (on-device, 3-10 sec)
Arabic / Hindi / Urdu | No / No / No | First-class
Pricing gap | $0 or $6,000+/yr | $0 / $348 / $2,388 / $11,988 / Enterprise
True offline | Requires AccessKey phone-home | Fully offline, no license server
REST API / Server | No | Yes (saj-server, OpenAI-compatible)
Neural Codec | No | Yes
Communications Platform | No | Yes (Saj Link)

Unfair Advantage #2

Neural codecs: 20-30x less bandwidth

Instead of transmitting audio waveforms, transmit tiny neural tokens and reconstruct voice on the receiving end.

Method | Bitrate | Quality
Opus (VoIP default) | 32 kbps | Excellent
Google Lyra V2 | 3.2 kbps | Good
Meta EnCodec | 1.5-6 kbps | Good-Excellent
Mimi (Kyutai) | 1.1 kbps | High
BigCodec | 1.04 kbps | Higher than ground truth

BigCodec at 1.04 kbps scored higher than ground truth in blind listening tests (MUSHRA 92.33).
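
The bandwidth claim can be checked directly from the bitrates in the table. The sketch below is illustrative arithmetic only: the function name is hypothetical, and the numbers are copied from the table rather than measured.

```python
# Bitrates in kbps, copied from the codec comparison table.
OPUS_KBPS = 32.0
NEURAL_CODECS_KBPS = {
    "Google Lyra V2": 3.2,
    "Mimi (Kyutai)": 1.1,
    "BigCodec": 1.04,
}


def reduction_factor(codec_kbps: float, baseline_kbps: float = OPUS_KBPS) -> float:
    """Return how many times less bandwidth a codec uses than the baseline."""
    if codec_kbps <= 0:
        raise ValueError("bitrate must be positive")
    return baseline_kbps / codec_kbps


for name, kbps in NEURAL_CODECS_KBPS.items():
    print(f"{name}: {reduction_factor(kbps):.1f}x less bandwidth than Opus")
```

Against a 32 kbps Opus baseline, the 20-30x range corresponds to codecs running at roughly 1.1-1.6 kbps, which is where Mimi and BigCodec sit.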

32 kbps: Opus (traditional)
1-3 kbps: Neural codec

No B2B neural codec SDK exists. Zero.

Communications Division

Saj Link — Slack × Discord × WhatsApp × Signal

Encrypted Text Messaging

Signal Protocol (X3DH + Double Ratchet) for DMs. MLS (RFC 9420) for groups. Threads, reactions, polls, disappearing messages. Offline-first with CRDT sync.
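
For readers unfamiliar with the Double Ratchet, the symmetric half of the idea can be sketched in a few lines. This is a minimal illustration, not Saj Link's implementation: it shows only the per-message chain-key step (HMAC-SHA256 with the 0x01/0x02 constants suggested in the Signal specification) and omits X3DH, the Diffie-Hellman ratchet, and message encryption. The starting secret is a placeholder.

```python
import hashlib
import hmac


def ratchet_step(chain_key: bytes) -> tuple[bytes, bytes]:
    """Advance the symmetric chain one step.

    Returns (next_chain_key, message_key). Each message key is used once,
    and the old chain key is discarded, so past keys cannot be recomputed
    from the current state.
    """
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key


# Placeholder for the secret that a real handshake (X3DH) would produce.
chain_key = hashlib.sha256(b"demo shared secret (placeholder)").digest()

message_keys = []
for _ in range(3):
    chain_key, message_key = ratchet_step(chain_key)
    message_keys.append(message_key)

print([key.hex()[:16] for key in message_keys])
```

Both parties run the same chain from the same starting secret, so they derive identical message keys without ever sending them.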

Neural Codec Voice & Video

1-3 kbps voice calls. Persistent voice channels (Discord-style). SFrame E2E encrypted group calls. Spatial audio. Studio-quality at satellite bandwidth.

AI Features (Built-in)

Live captioning (SajListen). Voice-first search (SajScribe). Meeting summarization. Speaker identification. Noise-cancelled calls (SajClean). Wake word activation.

Spaces & Channels

Organizational units, open/private/voice channels, threading, discovery feed, channel linking.

Bots & Integrations

E2E-aware bot platform. Webhooks, slash commands, marketplace. Developer API with SDKs.

Platforms: iOS, Android (native), Desktop (Tauri), Web (PWA + WASM), CLI

Security & Encryption

Post-quantum ready from day one

Layer | Protocol
Key exchange (1:1) | X3DH (Signal Protocol)
Ratcheting | Double Ratchet
Group messaging | MLS (RFC 9420)
Post-quantum | Kyber (ML-KEM) + Dilithium (ML-DSA), hybrid
Voice / video | SRTP + DTLS-SRTP + per-packet AEAD
Group calls | SFrame (RFC 9605)
At-rest | AES-256-GCM / ChaCha20-Poly1305
Auth | OPAQUE → Passkeys → FIDO2
  • Server never holds plaintext keys
  • Neural codec tokens encrypted per-packet — even 3-byte representations are ciphertext
  • Sealed Sender hides sender from server
  • Hybrid X25519 + Kyber-1024 protects against "harvest now, decrypt later"
  • Zero-knowledge voice channels — server relays only ciphertext
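
One common way to build a hybrid key exchange is to feed both shared secrets into a single key-derivation function, so the session key stays safe as long as either primitive holds. The sketch below assumes that construction; it is not taken from the Saj Link codebase. It implements HKDF-SHA256 (RFC 5869) with the standard library and uses random bytes as stand-ins for the X25519 and Kyber outputs. The function names and the `info` label are hypothetical.

```python
import hashlib
import hmac
import os


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand to `length` bytes."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()  # extract
    okm, block, counter = b"", b"", 1
    while len(okm) < length:  # expand: T(i) = HMAC(PRK, T(i-1) | info | i)
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def hybrid_session_key(classical_secret: bytes, pq_secret: bytes) -> bytes:
    """Derive one 32-byte key from both secrets (concatenate, then HKDF)."""
    return hkdf_sha256(
        classical_secret + pq_secret,
        salt=b"\x00" * 32,
        info=b"hybrid-demo",  # hypothetical context label
    )


# Placeholders: a real system would use the X25519 and Kyber shared secrets here.
x25519_secret = os.urandom(32)
kyber_secret = os.urandom(32)
print(hybrid_session_key(x25519_secret, kyber_secret).hex())
```

An attacker who records traffic today and later breaks X25519 with a quantum computer still lacks the Kyber input, which is the "harvest now, decrypt later" protection described above.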
SOC 2 (Y2) · ISO 27001 (Y2) · FIPS 140-3 (Y3) · FedRAMP (Y3-4)

Market Opportunity

Two markets. Combined TAM: $19-31B

SDK Division: $8.91B

Segment | TAM
Edge AI software (by 2030) | $8.91B
Smart speakers / IoT | $2.3B
Automotive voice AI | $1.8B
Healthcare voice | $1.1B

Communications Division: $10-22B

Segment | TAM
CPaaS/RTC platforms | $2-5B
Telco (emerging markets HD Voice) | $5-10B
Satellite / maritime | $500M-1B
Military / defense | $1-3B
Gaming / metaverse | $500M-1B

The neural codec makes voice possible where it wasn't before: LoRa mesh, BLE hearing aids, satellite pagers, 2G/3G emerging markets. New markets, not just cheaper existing ones.

Business Model

Five SDK revenue channels + comms SaaS

SDK Subscriptions

Tier | Price | Target
Community | $0 | Evaluation
Indie | $29/mo | Solo devs
Pro | $199/mo | Startups
Business | $999/mo | Growth
Enterprise | $5-40K/mo | Custom

OEM Per-Device

$2.00 (1-10K) → $0.75 (10-100K) → $0.30 (100K-1M) → $0.10-0.20 (1M+)
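
To make the volume tiers concrete, here is a small pricing sketch. It rests on assumptions the deck does not state: the price applies flat to the whole order based on its volume band (not graduated per band), band boundaries are inclusive at the upper end, and the 1M+ band uses the top of the $0.10-0.20 range. All names are hypothetical.

```python
# (inclusive upper bound of the volume band, per-device price in USD)
OEM_TIERS = [
    (10_000, 2.00),
    (100_000, 0.75),
    (1_000_000, 0.30),
    (float("inf"), 0.20),  # assumption: top of the $0.10-0.20 range
]


def per_device_price(devices: int) -> float:
    """Look up the per-device price for a given total device volume."""
    if devices < 1:
        raise ValueError("device count must be at least 1")
    for upper_bound, price in OEM_TIERS:
        if devices <= upper_bound:
            return price
    raise AssertionError("unreachable: last tier is unbounded")


def license_cost(devices: int) -> float:
    """Total OEM licence cost, assuming one flat rate for the whole volume."""
    return devices * per_device_price(devices)


for volume in (5_000, 50_000, 500_000, 2_000_000):
    print(f"{volume:>9,} devices -> ${license_cost(volume):,.0f}")
```

Under flat-band pricing, total cost can drop when an order crosses a boundary (10,000 devices cost more than 10,001), so a graduated schedule may be preferable in a real contract.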

Comms SaaS (Saj Link)

Tier | Price | Target
Free | $0 | Consumers
Pro | $8/user/mo | Teams
Business | $15/user/mo | Enterprise
Enterprise | Custom | Gov/Defense

Codec SDK Licensing

B2B licensing to CPaaS, gaming, telco. Plus: Console SaaS, Marketplace (20-25%), Certification ($5-25K/yr)

Unit Economics

10x infrastructure advantage = profitable free tier

240 KB: Opus per voice minute
12 KB: Neural codec per voice minute
10K: Concurrent calls per 1 Gbps (Opus)
100K: Concurrent calls per 1 Gbps (neural)

Metric | Opus (Traditional) | Saj Protocol
Server cost per 1M voice minutes | ~$150 | ~$15
Cost per user/year (infra) | ~$15 | ~$2
Packet size on wire | ~110 bytes (RTP) | 50 bytes

The cost advantage enables a profitable free tier — the thing that makes consumer adoption possible.
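
The per-minute and per-link figures follow from simple bitrate arithmetic, sketched below. The helper names are hypothetical, and the conversion assumes decimal units (1 kbps = 1,000 bits/s, 1 KB = 1,000 bytes) and one-way audio payload only.

```python
def kb_per_voice_minute(bitrate_kbps: float) -> float:
    """Payload kilobytes for one minute of one-way audio at a given bitrate."""
    return bitrate_kbps * 60 / 8  # kilobits/s * seconds / bits per byte


def implied_bitrate_kbps(kb_per_minute: float) -> float:
    """Inverse of kb_per_voice_minute: bitrate implied by a per-minute size."""
    return kb_per_minute * 8 / 60


def kbps_per_call(link_gbps: float, concurrent_calls: int) -> float:
    """Bandwidth budget per call when a link is shared evenly."""
    return link_gbps * 1_000_000 / concurrent_calls


opus_kb = kb_per_voice_minute(32)        # 240 KB, matching the Opus figure
neural_kbps = implied_bitrate_kbps(12)   # bitrate implied by 12 KB per minute
payload_ratio = opus_kb / 12             # payload reduction per voice minute
cost_ratio = 150 / 15                    # server cost per 1M minutes, from the table

print(opus_kb, neural_kbps, payload_ratio, cost_ratio)
print(kbps_per_call(1, 10_000), kbps_per_call(1, 100_000))
```

The 12 KB figure implies a codec running at about 1.6 kbps. Payload alone shrinks 20x, while the cost and concurrency rows show 10x, which is the figure the headline uses.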

Financial Projections

$393K → $5.3M → $33.7M

Year 1: $393K (SDK)
Year 2: $5.3M (SDK + Comms)
Year 3: $33.7M (SDK $22.7M + Comms $11M)

86% Gross Margin · 46% Y3 EBITDA Margin · $15.5M Y3 EBITDA · 35 Headcount Y3

SDK division generates cash from Month 3. Break-even on seed alone at Month 12-14. Communications division adds 33% of Y3 revenue.

Revenue Breakdown

SDK dominates early. Comms accelerates.

SDK Division

Stream | Y1 | Y2 | Y3
Subscriptions | $234K | $2.1M | $7.7M
OEM | $95K | $1.5M | $10.9M
Console/Mktplace | $54K | $700K | $3.7M
Certification | $10K | $130K | $460K
Total | $393K | $4.5M | $22.7M

Communications Division

Stream | Y1 | Y2 | Y3
Consumer SaaS | - | - | $500K
Pro/Business | - | $200K | $3M
Enterprise/Gov | - | $500K | $5M
Codec SDK | - | $100K | $2M
API usage | - | $50K | $500K
Total | $0 | $850K | $11M

By Year 3: SDK = 67% of revenue, Communications = 33%. OEM becomes the largest single stream as the data flywheel improves model quality.
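
The Year 3 split can be reproduced from the two revenue tables. The sketch below only restates those figures (in millions of USD); the variable and function names are illustrative.

```python
# Year 3 revenue in USD millions, copied from the two division tables.
Y3_DIVISIONS = {"SDK": 22.7, "Communications": 11.0}
Y3_STREAMS = {
    "Subscriptions": 7.7,
    "OEM": 10.9,
    "Console/Marketplace": 3.7,
    "Certification": 0.46,
    "Consumer SaaS": 0.5,
    "Pro/Business": 3.0,
    "Enterprise/Gov": 5.0,
    "Codec SDK": 2.0,
    "API usage": 0.5,
}


def shares(revenue: dict[str, float]) -> dict[str, float]:
    """Convert absolute revenue figures into fractions of the total."""
    total = sum(revenue.values())
    return {name: amount / total for name, amount in revenue.items()}


division_share = shares(Y3_DIVISIONS)
largest_stream = max(Y3_STREAMS, key=Y3_STREAMS.get)

for name, share in division_share.items():
    print(f"{name}: {share:.0%} of Y3 revenue")
print(f"Largest single stream: {largest_stream}")
```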

Valuation Trajectory

$12M → $80M → $510M

Stage | Timing | Pre-Money | Raise | Trigger
Seed | Now | $12M | $2.5M SAFE | Platform built, 65 models
Series A | Month 18-24 | $80M | $12M | $4-5M ARR, 300%+ growth

Month 36 Enterprise Value

$230M: Conservative (SDK only, 10x)
$510M: Base Case (SDK + Comms, 15x)
$800M: Aggressive (category leader, 20x)

Communications division roughly doubles the company valuation compared to SDK-only. Incremental engineering: 1 crate + platform = ~6 months on existing infrastructure.

The Ask

$2.5M seed at $12M pre-money

Use of Funds

Category | Share | Amount
People (6) | 55% | $1.38M
Marketing | 15% | $375K
Infra | 10% | $250K
R&D | 10% | $250K
G&A | 10% | $250K

Key Hires (in order)

Month | Role | Why
2 | DevRel Engineer | Community, docs, bus factor
3 | ML Engineer | Model quality, Console
4 | MENA BD Lead | Arabic monopoly monetization
5 | Enterprise AE | US/EU pipeline
6 | Backend Engineer | Console SaaS, billing
8 | Frontend Engineer | Console UI, marketplace

MENA BD Lead is the highest-ROI hire. The Arabic monopoly is worthless without someone in the region selling it.

Break-even on seed alone at Month 12-14. Series A is for growth acceleration, not survival. Founder retains 82.8% post-seed.
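
The 82.8% figure follows from the round terms. A quick check, assuming the founder holds 100% before the seed, the SAFE converts at exactly the $12M pre-money valuation, and no option pool is created (none of which the deck states explicitly):

```python
def post_round_ownership(pre_money: float, raise_amount: float,
                         ownership_before: float = 1.0) -> float:
    """Fraction held by an existing holder after a priced round.

    Post-money = pre-money + new capital; existing holders are diluted
    by the ratio pre-money / post-money.
    """
    post_money = pre_money + raise_amount
    return ownership_before * pre_money / post_money


founder_after_seed = post_round_ownership(pre_money=12.0, raise_amount=2.5)
print(f"Post-money: ${12.0 + 2.5:.1f}M, founder retains {founder_after_seed:.1%}")
```

An option pool or a different conversion valuation would reduce the founder's share below this figure.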

Milestones

Year 1: SDK revenue. Year 2: scale + launch comms. Year 3: lead.

Year 1

M1-2: SDK published to crates.io, PyPI, npm. Developer preview.
M3: First paying customer. Console MVP.
M5: Console launch. $10K MRR.
M6: Home Assistant integration. saj-codec crate complete.
M7-8: First MENA deal signed. $50K MRR.
M10-12: $100K MRR ($1.2M ARR). Saj Link beta. Series A conversations.

Year 2-3

Y2 Q1: Series A closed. First automotive pilot. Saj Link public beta.
Y2 Q2: 5+ OEM production deploys. Saj Link v1.0 (messaging + voice).
Y2 Q3-Q4: $1M+ MRR. First government comms contract. Codec SDK licensing.
Y3 Q1-Q2: 12+ OEM deploys. Saj Link enterprise launch. SOC 2.
Y3 Q3-Q4: $3M+ MRR. FIPS 140-3. Series B conversations.

Why Believe This

Velocity is the strongest signal

The entire Saj Speak platform — 13 Rust crates, 106 tests, 10 engines, 4 binding targets — was built from first commit to production-ready in 2.5 days.

65 ONNX models trained across 4 languages, 5 generations
7,646 lines of communications research produced (7 documents)
5 platform integrations shipping (macOS, Windows, iOS, Android, Web)

Also built (solo)

  • Bella — 191K LOC, 231 tables, 739 endpoints, ECS Fargate
  • SINGULARITY — 15-vertical AI marketplace platform
  • RaiseProof — 66K enriched investor profiles, live Stripe

What capital unlocks

  • The founder has proven they can build 10x faster than normal teams
  • Capital = people + distribution, not proof-of-concept
  • The technology risk is eliminated. This is an execution and market bet.

Your voice stays yours.

$2.5M Seed Round · $12M Pre-Money · $510M Y3 Base Case

Kane Sajdak

Founder & CEO, Inevara Pty Ltd

kane@sajdak.one · sajspeak.com