About Us

Aqua Voice is building the voice input layer for the age of AI. We train our own models and build deep OS integrations because doing voice well requires controlling the entire stack.

The way people work is changing. IC work is over; you manage AI agents now. This type of work is wonderfully suited to voice.

We aren't building conversational agents. We believe the most natural way to interact with your computer is voice in, text out (VITO). We believe that voice belongs at a level above the application, and that a small company can win by pushing the envelope.

We are applying relentless energy to this opportunity, and the results so far have been good. We hope you will join us.

The Role

You will own technical systems across the stack. We're a small team, post-product-market-fit, and growing. We have real users and real problems.

We prefer modern tools: Bun over Node, Fly over AWS when it makes sense, PyTorch over legacy ML frameworks.

The Work

GPU inference: We run our own ASR models.

Real-time transcription: WebSocket state machines, multi-provider failover, sub-100ms latency targets. We track 30+ metrics per session.

Native apps: Swift on macOS, C# on Windows, Electron IPC. System-level programming.

Backend: Django for billing and analytics (oops). Celery for background jobs. PostgreSQL.

There are no separate infrastructure and product teams. You will work on all of it.

Stack

- Languages: TypeScript, Python, Swift, C#

- Runtimes and frameworks: Bun, Node.js, Django, FastAPI

- ML: PyTorch

- Infra: Fly.io, Terraform, AWS (RDS), Redis

- Protocols: gRPC, WebSocket, REST

What We're Looking For

- Experience building and operating systems at scale

- Strong opinions about architecture, loosely held

- Comfortable being the most senior technical person in the room

- Able to take vague problems and ship working systems

- Code that survives production