Show HN: I built an AI phone system and wrote step-by-step instructions

Step-by-step tutorial for building an AI voicemail using Twilio Media Streams, FastAPI, OpenAI Realtime API, and Supabase

Why AI Voicemail Systems Matter

Business phones are constantly busy with customer inquiries, urgent requests, and follow-ups. Traditionally, large enterprises relied on call centers, mid-sized companies on receptionists, and busy individuals on personal assistants. AI-powered phone services now offer a reliable and cost-effective alternative: they can handle high call volumes, answer immediately, and operate 24/7, so no call goes unanswered outside office hours or while you are traveling internationally.

Use Cases:

  • AI front desk
  • Customer support
  • Sales qualification
  • Automated voicemail with transcript and summary

High-Level Architecture Overview

The system consists of four main components:

  1. Twilio Media Streams – Streams live call audio to your server.
  2. FastAPI WebSocket Bridge – Connects Twilio ↔ OpenAI and handles audio conversion.
  3. OpenAI Realtime API – Processes live AI conversation and generates responses.
  4. Supabase – Stores call transcripts, AI summaries, and voicemail data.

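Flow Diagram:

Caller ↔ Twilio Media Streams ↔ FastAPI WebSocket Bridge ↔ OpenAI Realtime API, with transcripts and summaries saved from the bridge to Supabase.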


Key Components and Technologies

| Component | Purpose |
|---|---|
| Twilio Media Streams | Real-time call streaming |
| FastAPI | WebSocket server and bridge |
| OpenAI Realtime API | AI voice conversation and transcription |
| audioop-lts | Audio format conversion |
| Supabase | Database for transcripts and RAG knowledge base |
| RAG (Retrieval Augmented Generation) | Personalized AI instructions and context |

Prerequisites

  1. Accounts & Services

    • Twilio account + phone number
    • OpenAI API key (Realtime API access)
    • Supabase project with tables: calls, call_transcripts, user_settings, agent_prompts, and optionally knowledge_base (see Step 3)
  2. Python Environment

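    • Python 3.8+ (audioop ships with the standard library through Python 3.12; on 3.13+ it comes from the audioop-lts package)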
    • pip package manager
  3. Development Tools

    • ngrok for local testing
    • Terminal/command line access

Step-by-Step Setup Instructions

Step 1: Install Dependencies
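
A minimal sketch for checking the environment after installing. It assumes the project depends on the libraries named in this guide, which you could install with something like pip install fastapi uvicorn websockets supabase python-dotenv (plus audioop-lts on Python 3.13+). The module list below is an assumption, not the author's requirements file, so adjust it to your project.

```python
from importlib.util import find_spec

# Assumed import names for the libraries this guide mentions (hypothetical list).
ASSUMED_MODULES = ["fastapi", "uvicorn", "websockets", "supabase", "dotenv", "audioop"]


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be imported."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(ASSUMED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

Running the script before starting the server gives a quick list of anything that still needs to be installed.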


Step 2: Configure Environment Variables

Create a .env file:
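
The guide does not list the actual keys, so the variable names below are hypothetical. This is a minimal sketch of validating the settings at startup, assuming the .env values have already been loaded into the process environment (for example with python-dotenv's load_dotenv()).

```python
import os

# Hypothetical variable names; rename them to match your own .env file.
REQUIRED_VARS = (
    "TWILIO_ACCOUNT_SID",
    "TWILIO_AUTH_TOKEN",
    "OPENAI_API_KEY",
    "SUPABASE_URL",
    "SUPABASE_SERVICE_KEY",
)


def load_settings(env=None):
    """Return the required settings as a dict, or raise if any are missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Failing fast here is a design choice: a missing API key surfaces when the server starts instead of in the middle of a live call.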


Step 3: Set Up Supabase Tables

Create the following tables:

  • calls: call metadata
  • call_transcripts: transcripts and summaries
  • user_settings: phone number → user_id
  • agent_prompts: custom prompts for AI
  • knowledge_base: optional RAG chunks

Step 4: Create the Twilio Webhook

  • Endpoint: /api/v1/incoming-call-realtime
  • Returns TwiML connecting call to WebSocket
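
As a rough sketch, the TwiML for a bidirectional media stream is a Response containing Connect and Stream elements. The helper name and the WebSocket URL are assumptions. In FastAPI, the webhook route could return this string with media_type="application/xml".

```python
from xml.etree.ElementTree import Element, SubElement, tostring


def build_stream_twiml(websocket_url):
    """Build TwiML that connects the incoming call to a WebSocket media stream."""
    response = Element("Response")
    connect = SubElement(response, "Connect")
    # ElementTree escapes attribute values, so query strings in the URL are safe.
    SubElement(connect, "Stream", {"url": websocket_url})
    body = tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>' + body


if __name__ == "__main__":
    # Hypothetical public URL, e.g. the ngrok hostname with a wss:// scheme.
    print(build_stream_twiml("wss://example.ngrok.app/media-stream"))
```

Twilio expects a wss:// URL here, so the ngrok HTTPS hostname is reused with the WebSocket scheme.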

Step 5: Write the WebSocket Bridge

  • Accepts Twilio WebSocket
  • Connects to OpenAI Realtime API
  • Handles Twilio stream events: connected, start, media, stop
  • Collects transcripts and forwards audio
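
The event handling above can be sketched as a pure function that the WebSocket loop calls for every text frame Twilio sends. The CallState class and function name are hypothetical. The message fields (event, start.streamSid, start.callSid, media.payload) follow Twilio's Media Streams message format.

```python
import base64
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class CallState:
    """Per-call state gathered from Twilio's stream messages."""
    stream_sid: Optional[str] = None
    call_sid: Optional[str] = None
    ended: bool = False


def handle_twilio_message(raw, state):
    """Update `state` from one Twilio message; return mu-law bytes for media events."""
    message = json.loads(raw)
    event = message.get("event")
    if event == "start":
        # The start event carries the identifiers needed to send audio back.
        state.stream_sid = message["start"]["streamSid"]
        state.call_sid = message["start"].get("callSid")
    elif event == "media":
        # Audio arrives base64-encoded as 8 kHz mu-law.
        return base64.b64decode(message["media"]["payload"])
    elif event == "stop":
        state.ended = True
    # "connected" and any other events (mark, dtmf) need no action here.
    return None
```

Keeping this logic separate from the socket code makes it easy to unit-test with recorded messages.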

RAG schema

RAG Retrieval Flow for AI Instructions
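
The retrieval details are not spelled out here, so the following is only an illustrative sketch. It assumes knowledge_base rows hold text chunks with precomputed embeddings and that the base prompt comes from agent_prompts. The similarity search runs in memory for clarity; in a real deployment it could run inside the database instead (for example with pgvector). All function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_chunks(query_embedding, chunks, k=3):
    """chunks is a list of (text, embedding) pairs; return the k most similar texts."""
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_embedding, chunk[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def build_instructions(base_prompt, retrieved):
    """Append retrieved context to the base prompt used as the AI's instructions."""
    if not retrieved:
        return base_prompt
    context = "\n".join(f"- {text}" for text in retrieved)
    return f"{base_prompt}\n\nRelevant context:\n{context}"
```

The resulting string would be passed to the Realtime session as its instructions before the call audio starts flowing.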


Audio Format Conversion

  • Twilio → OpenAI: μ-law 8kHz → PCM16 24kHz
  • OpenAI → Twilio: PCM16 24kHz → μ-law 8kHz
  • Conversion handled by audioop-lts library
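
A minimal sketch of the two conversions using audioop (standard library through Python 3.12, supplied by audioop-lts on 3.13+). The AudioConverter class is hypothetical. The audioop calls ulaw2lin, lin2ulaw, and ratecv are the library's own.

```python
import audioop  # stdlib through Python 3.12; provided by audioop-lts on 3.13+

TWILIO_RATE = 8000    # Hz, mu-law, mono
OPENAI_RATE = 24000   # Hz, PCM16, mono
SAMPLE_WIDTH = 2      # bytes per PCM16 sample


class AudioConverter:
    """Converts between Twilio and OpenAI audio, keeping resampler state per direction."""

    def __init__(self):
        self._up_state = None
        self._down_state = None

    def twilio_to_openai(self, ulaw_bytes):
        """mu-law 8 kHz -> PCM16 24 kHz."""
        pcm_8k = audioop.ulaw2lin(ulaw_bytes, SAMPLE_WIDTH)
        pcm_24k, self._up_state = audioop.ratecv(
            pcm_8k, SAMPLE_WIDTH, 1, TWILIO_RATE, OPENAI_RATE, self._up_state
        )
        return pcm_24k

    def openai_to_twilio(self, pcm_24k):
        """PCM16 24 kHz -> mu-law 8 kHz."""
        pcm_8k, self._down_state = audioop.ratecv(
            pcm_24k, SAMPLE_WIDTH, 1, OPENAI_RATE, TWILIO_RATE, self._down_state
        )
        return audioop.lin2ulaw(pcm_8k, SAMPLE_WIDTH)
```

Carrying the ratecv state from one chunk to the next avoids small discontinuities at chunk boundaries, which is why the sketch uses one converter instance per call.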

Streaming Conversation Logic

  • Twilio sends media events → converted → OpenAI
  • OpenAI sends response.audio.delta → converted → Twilio
  • AI transcription events collected in real-time
  • Async tasks handle bidirectional streaming
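
The bidirectional loop above can be sketched with two concurrent tasks. To keep the example self-contained, asyncio queues stand in for the two WebSocket connections (an assumption), and the converters are passed in as plain functions. The OpenAI event names (input_audio_buffer.append, response.audio.delta) and the Twilio media message shape are the ones these APIs use. The helper names are hypothetical.

```python
import asyncio
import base64


def twilio_media_to_openai(message, convert):
    """Turn a Twilio media message into an input_audio_buffer.append event."""
    ulaw = base64.b64decode(message["media"]["payload"])
    audio = base64.b64encode(convert(ulaw)).decode("ascii")
    return {"type": "input_audio_buffer.append", "audio": audio}


def openai_delta_to_twilio(event, stream_sid, convert):
    """Turn a response.audio.delta event into a Twilio media message."""
    pcm = base64.b64decode(event["delta"])
    payload = base64.b64encode(convert(pcm)).decode("ascii")
    return {"event": "media", "streamSid": stream_sid, "media": {"payload": payload}}


async def pump(inbox, outbox, translate):
    """Forward translated messages until a None sentinel arrives."""
    while True:
        message = await inbox.get()
        if message is None:
            break
        translated = translate(message)
        if translated is not None:
            await outbox.put(translated)


async def run_bridge(from_twilio, to_openai, from_openai, to_twilio,
                     stream_sid, to_pcm, to_ulaw):
    """Run both directions concurrently; non-audio messages are dropped here."""
    def caller_side(msg):
        if msg.get("event") == "media":
            return twilio_media_to_openai(msg, to_pcm)
        return None

    def ai_side(evt):
        if evt.get("type") == "response.audio.delta":
            return openai_delta_to_twilio(evt, stream_sid, to_ulaw)
        return None

    await asyncio.gather(
        pump(from_twilio, to_openai, caller_side),
        pump(from_openai, to_twilio, ai_side),
    )
```

In a real bridge, each pump would read from and write to its WebSocket instead of a queue, and transcription events would be collected in the AI-side handler.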

Saving Transcripts to Supabase

  • Collect conversation lines (user + AI)
  • Generate AI summary
  • Save to calls and call_transcripts tables
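
A minimal sketch of the save step, assuming the summary has already been generated and that client is a supabase-py client (its table(...).insert(...).execute() chain is the only part used). The column names are hypothetical, since the guide names the tables but not their columns.

```python
def format_transcript(lines):
    """lines is a list of (speaker, text) tuples in conversation order."""
    return "\n".join(f"{speaker}: {text}" for speaker, text in lines)


def save_call(client, call_sid, user_id, lines, summary):
    """Insert one row into calls and one into call_transcripts.

    Column names here are hypothetical; match them to your own schema.
    """
    call_row = {"call_sid": call_sid, "user_id": user_id}
    transcript_row = {
        "call_sid": call_sid,
        "transcript": format_transcript(lines),
        "summary": summary,
    }
    client.table("calls").insert(call_row).execute()
    client.table("call_transcripts").insert(transcript_row).execute()
    return call_row, transcript_row
```

Calling this once, after the stop event, keeps database writes out of the latency-sensitive audio loop.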

Testing the System

  • Start FastAPI: uvicorn app.main:app --reload --port 8000
  • Expose with ngrok: ngrok http 8000
  • Update Twilio webhook to ngrok URL
  • Make test calls, verify audio streaming, transcription, and RAG-enhanced responses

What You Can Build With This System

  • AI Receptionist – Answer calls automatically
  • Customer Support Bot – Live issue resolution
  • Sales Qualification Agent – Collect leads
  • AI Voicemail System – Automated greeting, recording, transcript, summary
  • Multi-Tenant SaaS – Custom AI agents per business
  • Internal Helpdesk – HR, IT support
  • Workflow Automation – Trigger notifications or CRM actions

TL;DR

Either read this article, or feed this context to the AI-assisted IDE of your choice and let it build the system for you. Download the ready-to-use prompt.


By following this guide, you can build an AI voice agent capable of handling calls, transcribing them, and generating voicemail summaries automatically. Happy vibing!
