How I Built Niobe: An AI Waitress with Gemini Live and Google Cloud — DeepSeek Blog | Neura Market
    Neura MarketNeura Market/DeepSeek
    ChatGPTChatGPTClaudeClaudeGeminiGeminiCursorCursorGrokGrokPerplexityPerplexityDeepSeekDeepSeek
    CoPilotCoPilotStable DiffusionStable DiffusionMidjourneyMidjourney
    View All Directories
    OverviewRulesPromptsMCPsAgentsBlogVideosGuidesCoursesCommunityTrendingGenerate
    DeepSeekBlogHow I Built Niobe: An AI Waitress with Gemini Live and Google Cloud
    Back to Blog
    How I Built Niobe: An AI Waitress with Gemini Live and Google Cloud
    devchallenge

    How I Built Niobe: An AI Waitress with Gemini Live and Google Cloud

    Providence Ifeosame March 16, 2026
    0 views

    Disclaimer: I created this blog post for the purposes of entering the Gemini Live Agent Challenge...

    **Disclaimer:** *I created this blog post for the purposes of entering the Gemini Live Agent Challenge hackathon. When sharing on social media, use the hashtag #GeminiLiveAgentChallenge.* --- ## What is Niobe? **Niobe** is an AI waitress for restaurants. Restaurant owners upload their menu (even as a photo), get a shareable link, and customers can talk to the waitress by voice—ask about the menu, place orders, and have a natural conversation. No app install required; everything runs in the browser. I built it using **Google’s Gemini** (including the **Gemini Live API** for real-time voice) and **Google Cloud** for deployment. Here’s how I put it together. --- ## Two places where Gemini powers the product I use Gemini in two different ways: ### 1. Gemini API: turning menu images into structured data Restaurant owners don’t have to type their menu. They upload images (photos of the menu, PDFs, etc.). The **Laravel** backend sends those images to the **Gemini API** (via the Laravel AI package with the Gemini driver). Gemini returns structured text and JSON—dishes, categories, prices—and I store that in **PostgreSQL**. So the “brain” of the waitress (menu + context) is partly built by Gemini from images. This is configured in Laravel with `GEMINI_API_KEY` and the default provider in `config/ai.php`: ```php // config/ai.php 'default' => 'gemini', 'default_for_images' => 'gemini', 'providers' => [ 'gemini' => [ 'driver' => 'gemini', 'key' => env('GEMINI_API_KEY'), ], // ... ], ``` Menu extraction then calls Gemini with a prompt and the uploaded image attachments; the response is parsed as JSON and saved to the database. ### 2. Gemini Live API: the voice conversation The real-time voice experience is powered by the **Gemini Live API**. My **Go voice agent** doesn’t do speech-to-text or text-to-speech itself; it acts as a **proxy** between the browser and Gemini Live: - The customer opens the “Talk” page and connects via **WebSocket** to the Go service (`/live?niobe=<slug>`). - The Go agent loads the waitress and menu from the **same PostgreSQL database** Laravel uses, builds a system instruction and tool definitions, and opens a **Gemini Live** session using the Google GenAI SDK. - Audio flows both ways: browser ↔ Go agent ↔ Gemini Live. The model speaks and listens in real time, with natural turn-taking and interrupt handling. When the model decides to take an action (e.g. “place order”), it sends a **tool call** to the agent. The Go service runs **LocalNiobeTools**: it writes to the database (e.g. `waitress_action_logs`), can send email or fire webhooks, and returns the result back to Gemini. The model then confirms to the user in voice. So: **Gemini Live** = voice + reasoning; **Go + PostgreSQL** = tools and persistence. The Go agent connects to Gemini Live with the Google GenAI SDK and wires up system instruction, tools, and audio config: ```go // agent/live/google.go (simplified) client, _ := genai.NewClient(ctx, &genai.ClientConfig{ APIKey: cfg.GetAPIKey(), HTTPOptions: httpOpts, }) model := "gemini-2.5-flash-native-audio-preview-12-2025" connectConfig := &genai.LiveConnectConfig{ ResponseModalities: []genai.Modality{genai.ModalityAudio}, SpeechConfig: &genai.SpeechConfig{ VoiceConfig: &genai.VoiceConfig{ PrebuiltVoiceConfig: &genai.PrebuiltVoiceConfig{VoiceName: "Aoede"}, }, }, SystemInstruction: &genai.Content{ Parts: []*genai.Part{genai.NewPartFromText(systemInstruction)}, Role: genai.RoleUser, }, Tools: tools, } sess, _ := client.Live.Connect(ctx, model, connectConfig) ``` From there, a proxy bridges the browser WebSocket and this session so audio and tool calls flow both ways. --- ## Why Google Cloud? I run the app on **Google Cloud** so that the Laravel app and the Go agent can share one **Cloud SQL (PostgreSQL)** instance. The agent is deployed as a container (e.g. **Cloud Run**), and the web app is deployed via **Terraform** and **Cloud Build** in the `deploy/` folder. Same VPC and database mean low latency and a single source of truth for menus, waitresses, and action logs. Configuration is via environment variables—no secrets in code: ```bash # Agent (Go) – example .env GEMINI_API_KEY=your_key DATABASE_URL=postgres://user:password@/niobe?host=/cloudsql/PROJECT:REGION:INSTANCE PORT=8080 APP_URL=https://your-laravel-app.run.app # Optional: use Vertex AI instead of Gemini API # GOOGLE_GENAI_USE_VERTEXAI=true ``` --- ## Architecture in one picture - **Browser** → **Laravel** (HTTPS) for dashboard and menu upload; Laravel uses **Gemini API** for menu extraction and writes to **PostgreSQL**. - **Browser** → **Go agent** (WebSocket) for voice; the agent reads from **PostgreSQL**, talks to **Gemini Live API**, and runs tools (DB, email, webhooks) in-process. So: one database, two Gemini touchpoints (Gemini API for menus, Gemini Live for voice), and Google Cloud to host and connect it all. --- ## What I’d do next I’d add more tool types (e.g. table reservation, kitchen display), support for more languages, and tighter integration with POS systems. The current stack—Laravel + Vue/Inertia for the app, Go for the voice proxy, Gemini for vision and live voice, and Google Cloud for deployment—gives me a clear path to scale. If you want to see the code or run it yourself, check out the repo and the [architecture doc](https://github.com/your-org/niobe-project/blob/main/docs/ARCHITECTURE.md) for diagrams and data flows. --- *This post was created for the purposes of entering the Gemini Live Agent Challenge hackathon. #GeminiLiveAgentChallenge*

    Tags

    devchallengegeminigooglecloudshowdev

    Comments

    More Blog

    View all
    How I'm using ASTs and Gemini to solve the "Codebase Onboarding" problem 🧠ai

    How I'm using ASTs and Gemini to solve the "Codebase Onboarding" problem 🧠

    Hi everyone! 👋 I’m Tara, a Senior Software Engineer and Consultant. Over the years, I've jumped...

    T
    tworrell
    Local AI Will Save Us All (The Math Says So, Trust Me)ai

    Local AI Will Save Us All (The Math Says So, Trust Me)

    Every few weeks a take goes viral in tech circles making the case for ditching cloud AI and running...

    S
    Sebastian Schürmann
    Lost in the AI Hype, I Started Smallai

    Lost in the AI Hype, I Started Small

    And it helped me get back into tech without drowning TL;DR at the end Coming back to...

    R
    Rohini Gaonkar
    Building a Replay-Tested Interactive Brokers Client in Gogo

    Building a Replay-Tested Interactive Brokers Client in Go

    I wanted an IBKR library that felt like Go and had testing I could trust. So I wrote one.

    T
    Thomas Marcelis
    Playwright in Pictures: Fully Parallel Modeplaywright

    Playwright in Pictures: Fully Parallel Mode

    Playwright’s fullyParallel mode is often treated as a simple performance switch. In practice, it...

    V
    Vitaliy Potapov
    Designing a CLI for Both Humans and Agentscli

    Designing a CLI for Both Humans and Agents

    Learn how Alpic designed its CLI for both human developers and AI agents — covering tradeoffs like polling, context windows, interactivity, and statelessness.

    J
    Julien Vallini

    Stay up to date

    Get the latest DeepSeek prompts, rules, and resources delivered to your inbox weekly.

    Neura Market LogoNeura Market

    Discover the best AI prompts, plugins, and resources for DeepSeek and more.

    Content Types

    • Rules
    • Prompts
    • MCPs
    • Agents
    • Guides

    Platforms

    • ChatGPT Directory
    • Claude Directory
    • Gemini Directory
    • Cursor Directory
    • Grok Directory
    • Perplexity Directory
    • DeepSeek Directory
    • CoPilot Directory
    • Stable Diffusion Directory
    • Midjourney Directory
    • All Directories

    Resources

    • Blog
    • Documentation
    • Help Center
    • Marketplace

    Legal

    • Privacy Policy
    • Terms of Service

    © 2026 Neura Market. All rights reserved.

    |

    Not affiliated with any AI platform vendors.