Relay
Released · Early Access

An AI phone receptionist,
built end-to-end.

Relay answers business phones in real time — bridging callers to a Gemini-powered voice assistant that takes messages, books appointments, and silently switches languages on the fly. Live in production at relay.sandybrook.io.

Sign up free — Relay phones you within seconds for a five-minute live demo. No credit card.

For the user

What Relay does for a small business.

A solo dentist, a contractor, a pet groomer — they miss calls. Relay picks up instead, holds a natural conversation with the caller, captures what they need, and emails a clean summary. Two modes share one number: external callers get a polite receptionist; the owner gets an open-ended assistant that can search the web and recall their own message history by voice.

Picks up 24/7

Greets callers, takes messages, classifies urgency. No voicemail, no hold music.

Speaks 28 languages

Greets in your chosen language and silently mirrors the caller mid-call.

Books appointments live

Connects to Google Calendar — checks availability and books on the call.

Hands off urgent calls

Live transfer to a real phone, with a private whisper announcing the caller.

Why this matters

A consulting case study, deployed.

Relay is what Sandy Brook DevWorks looks like when given a real product with real constraints — regulated telephony, real-time audio, multi-tenant billing, and zero-downtime infrastructure-as-code. Four engineering surfaces stacked together to ship an AI product that actually picks up the phone.

Real-time voice bridge

Twilio Media Streams ↔ Gemini Live

A long-lived WebSocket per call shuttles raw audio in both directions: G.711 μ-law at 8 kHz from Twilio, PCM16 at 16 kHz to Gemini Live's bidirectional speech endpoint, and back again. Two concurrent System.Threading.Channels loops keep the pipeline lossless under load. The bridge also includes built-in VAD with tuned start- and end-of-speech sensitivity, μ-law ↔ PCM16 and 8 kHz ↔ 16 kHz transcoding written from scratch (no external codec), and a SHAKEN/STIR carrier-attestation filter that rejects spoofed calls before billable audio ever flows.
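To make the transcoding step concrete, here is a minimal sketch of G.711 μ-law ↔ PCM16 conversion with a naive 2x resampler. It is written in Python for brevity and is not Relay's actual code. The function names are hypothetical, and the linear-interpolation and pair-averaging resamplers are assumptions chosen for simplicity; a production bridge would likely use a proper low-pass filter.

```python
import struct

BIAS = 0x84   # 132, the standard G.711 mu-law bias
CLIP = 32635  # largest magnitude that still fits after adding the bias


def ulaw_to_pcm16(byte: int) -> int:
    """Decode one G.711 mu-law byte to a signed 16-bit sample."""
    u = ~byte & 0xFF                     # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


def pcm16_to_ulaw(sample: int) -> int:
    """Encode one signed 16-bit sample as a G.711 mu-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    exponent, mask = 7, 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1                    # find the segment of the top set bit
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def upsample_2x(samples):
    """8 kHz -> 16 kHz by linear interpolation (naive, assumption)."""
    out = []
    for i, s in enumerate(samples):
        nxt = samples[i + 1] if i + 1 < len(samples) else s
        out.append(s)
        out.append((s + nxt) // 2)
    return out


def downsample_2x(samples):
    """16 kHz -> 8 kHz by averaging sample pairs (naive, assumption)."""
    return [(samples[i] + samples[i + 1]) // 2
            for i in range(0, len(samples) - 1, 2)]


def ulaw8k_to_pcm16k(payload: bytes) -> bytes:
    """mu-law 8 kHz bytes -> little-endian PCM16 at 16 kHz."""
    wide = upsample_2x([ulaw_to_pcm16(b) for b in payload])
    return struct.pack(f"<{len(wide)}h", *wide)


def pcm16k_to_ulaw8k(pcm: bytes) -> bytes:
    """Little-endian PCM16 at 16 kHz -> mu-law 8 kHz bytes."""
    n = len(pcm) // 2
    samples = list(struct.unpack(f"<{n}h", pcm[: n * 2]))
    return bytes(pcm16_to_ulaw(s) for s in downsample_2x(samples))
```

A 20 ms frame of 160 μ-law bytes becomes 640 bytes of 16 kHz PCM16 on the way up, and shrinks back to 160 bytes on the way down.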

WebSockets Twilio Voice Gemini Live VAD SHAKEN/STIR

AI orchestration

Semantic Kernel + tool calling

Mode-gated plugins are registered with the live model at call start. Receptionist gets the Calendar plugin (check availability, book, cancel, with timezone-aware slot sampling). Assistant Mode gets Messages ("what did Sarah call about?") and WebSearch (the Gemini text API with native google_search grounding for live weather, traffic, and hours). Receptionist can also conditionally register a RedirectCall tool that emits live TwiML to bridge the caller to a real phone, with a one-line whisper heard only by the dialed party.
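The gating idea can be sketched in a few lines. This is an illustration in Python, not the Semantic Kernel code Relay runs: the tool names, the `transfer_number` setting, and the `tools_for_call` helper are all hypothetical, and how a call is classified as owner or external is left out because the page does not describe it.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class CallMode(Enum):
    RECEPTIONIST = "receptionist"  # external callers
    ASSISTANT = "assistant"        # the owner calling their own line


@dataclass(frozen=True)
class Tool:
    name: str
    modes: frozenset
    # Hypothetical: a line setting that must be present for the tool to load.
    requires_setting: Optional[str] = None


TOOLS = [
    Tool("calendar_check_availability", frozenset({CallMode.RECEPTIONIST})),
    Tool("calendar_book", frozenset({CallMode.RECEPTIONIST})),
    Tool("calendar_cancel", frozenset({CallMode.RECEPTIONIST})),
    Tool("redirect_call", frozenset({CallMode.RECEPTIONIST}),
         requires_setting="transfer_number"),
    Tool("messages_recall", frozenset({CallMode.ASSISTANT})),
    Tool("web_search", frozenset({CallMode.ASSISTANT})),
]


def tools_for_call(mode: CallMode, line_config: dict) -> list:
    """Return the tool names to register with the model for this call."""
    return [
        t.name
        for t in TOOLS
        if mode in t.modes
        and (t.requires_setting is None or line_config.get(t.requires_setting))
    ]
```

Because the list is computed once at call start, the model never sees a tool that the current mode or line configuration does not allow.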

Semantic Kernel Vertex AI Tool Calling Search Grounding Google Calendar

Multi-tenant SaaS

3 services · 19 controllers · ~40 endpoints

Three .NET services with strict boundaries: a Blazor Server BFF for the dashboard, a JWT-auth REST API for tenant operations, and an internal voice service that owns the Twilio SDK. Tenants authenticate via Google OAuth, Firebase email/password, or WebAuthn passkeys; cross-service calls use Google-signed OIDC tokens instead of shared secrets. The Stripe subscription catalog is managed by Pulumi and backed by webhook-driven self-healing. Per-line config overrides cascade from global tenant settings, and the phone-number lifecycle (provision, hold, resume, release) is fully reversible from the dashboard.
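As a rough illustration of the per-line override cascade, the sketch below resolves one line's effective settings from tenant-wide defaults. It is Python rather than Relay's .NET code, the setting names are invented for the example, and treating `None` as "inherit from the tenant" is an assumption.

```python
def effective_config(tenant_defaults: dict, line_overrides: dict) -> dict:
    """Merge per-line overrides on top of tenant-wide defaults.

    Assumption: a value of None on the line means "inherit the tenant value".
    """
    merged = dict(tenant_defaults)
    for key, value in line_overrides.items():
        if value is not None:
            merged[key] = value
    return merged


# Hypothetical settings, for illustration only.
tenant = {"greeting_language": "en", "timezone": "America/New_York"}
line = {"greeting_language": "es", "timezone": None}
resolved = effective_config(tenant, line)
```

Here `resolved` keeps the tenant timezone while the line's own greeting language wins.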

Blazor Server ASP.NET Core JWT + OIDC WebAuthn Stripe Firestore

Production-grade infrastructure

Pulumi · GCP · OpenTelemetry

Everything is code. Pulumi (C#) provisions Cloud Run services, Firestore composite indexes, Cloud Tasks queues, Cloud Scheduler jobs, KMS keyrings, Stripe products, Secret Manager secrets, and the IAM triangles required for OIDC service-to-service auth. Post-call enrichment runs on Cloud Tasks; daily-summary fan-out runs on Cloud Scheduler. Calendar OAuth refresh tokens are envelope-encrypted via Cloud KMS before persistence. Custom OpenTelemetry instrumentation around the Gemini Live session exports gemini.session.duration, gemini.turn.latency, and gemini.interruptions via an otelcol-google sidecar to Cloud Trace + Cloud Monitoring.

Pulumi IaC Cloud Run Cloud Tasks Cloud KMS OpenTelemetry Secret Manager
Architecture

Three services, two data planes.

The voice plane is hot — sub-second audio streaming end-to-end. The dashboard plane is transactional — JWT, REST, and reads off the same Firestore that the voice plane writes. Service-to-service hops are OIDC-signed, never shared secrets.

┌──────────┐                    ┌─────────────┐
│  Caller  │ ── PSTN call ────▶ │   Twilio    │
└──────────┘                    └──────┬──────┘
                                       │ HTTP webhook
                                       │ + WSS audio
                                       ▼
              ┌────────────────────────────┐
              │  Twilio.API (Cloud Run)    │   ◀───── Gemini Live (Vertex AI)
              │  ─ voice bridge            │          bidirectional WSS
              │  ─ SK plugins (tools)      │
              │  ─ post-call enrichment    │   ─────▶ Cloud Tasks ─▶ self
              └─────────────┬──────────────┘           (transcribe, summarize,
                            │ writes                    email, SMS)
                            ▼
              ┌────────────────────────────┐
              │         Firestore          │   ◀──── Relay.API (Cloud Run)
              └────────────────────────────┘          ─ JWT REST API
                                                         ▲
                                                         │ OIDC bearer
                                                         ▼
                                                 ┌────────────────┐
                                                 │   Relay.Web    │
                                                 │   Blazor BFF   │
                                                 │  Tenant signs  │
                                                 │  in here ──┐   │
                                                 └────────────│───┘
                                                              │
                                                        Google OAuth
                                                        Firebase Auth
                                                        WebAuthn
Voice plane

Sub-second audio. WebSockets. Tuned VAD. Built-in spam filter.

Dashboard plane

JWT REST. Tenant-scoped queries. Stripe-driven plan limits.

Async plane

Cloud Tasks for enrichment. Scheduler for digests. Idempotent.
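Cloud Tasks delivers at least once, so a post-call step can occasionally run twice. One common way to keep such steps idempotent is to key each unit of work and skip repeats, sketched below in Python. The class, the `(call_id, step)` key, and the in-memory set standing in for a durable store are all assumptions for illustration, not Relay's implementation.

```python
class EnrichmentHandler:
    """Runs each (call_id, step) pair at most once per process.

    Assumption: a real service would record completion in a durable store
    inside a transaction; a set is used here only to show the shape.
    """

    def __init__(self, action):
        self._action = action     # e.g. a function that sends the summary
        self._completed = set()

    def handle(self, call_id: str, step: str) -> bool:
        key = (call_id, step)
        if key in self._completed:
            return False          # duplicate delivery: acknowledge and skip
        self._action(call_id)
        self._completed.add(key)  # mark done only after the action succeeds
        return True


sent = []
handler = EnrichmentHandler(sent.append)
first = handler.handle("call-1", "email")
second = handler.handle("call-1", "email")   # simulated redelivery
```

The second delivery returns without sending anything, so the caller's summary goes out once even if the queue retries.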

Tech stack

What it's built with.

.NET 10

ASP.NET Core, Blazor Server

Vertex AI Gemini

Live Audio + Text + Search grounding

Semantic Kernel

Plugins, tool calling

Twilio

Media Streams, TwiML, 10DLC

GCP Cloud Run

3 services, autoscaling

Firestore

Multi-tenant document store

Cloud Tasks + Scheduler

Async + cron, OIDC-gated

Cloud KMS

OAuth token envelope encryption

Pulumi (C#)

All infra as code

OpenTelemetry

Custom Gemini metrics + traces

Stripe

Subscriptions + metered billing

Identity Platform

Firebase Auth + WebAuthn

For potential customers

Try Relay live.

Sign up free, pick a language, and Relay will phone you within seconds for a five-minute live demo of the receptionist on your line.

relay.sandybrook.io
For potential clients

Need something like this?

Sandy Brook DevWorks builds AI-integrated systems end-to-end — from real-time audio pipelines to multi-tenant SaaS to cloud infrastructure. Relay is one of them.

Talk to Sandy Brook