How we're using Gemini Embeddings to build a smarter, community-driven feed on DEV — CoPilot Blog

    Ben Halpern May 20, 2026


Big improvements incoming 👋

Finding the right balance for a feed algorithm is historically really hard. If you optimize purely for clicks and comments, you end up with a clickbait echo chamber. But if you just sort by recency, it's a firehose where great discussions disappear in hours.

We've wrestled with this tension at DEV for a long time. We want a feed that feels alive, but actually surfaces high-quality, intellectually stimulating stuff.

So, we're trying something new. We are combining standard community signals (like who you follow and what you react to) with [Gemini Embeddings 2](https://docs.cloud.google.com/gemini-enterprise-agent-platform/models/gemini/embedding-2) and `pgvector`.

Here is a look under the hood at how we are putting this together.

---

## 1. Keeping things flexible and auditable

Instead of duct-taping API calls all over the codebase, we built a flexible foundation using wrapper classes, mostly centered around `Ai::Base` and `Ai::Embedding`.

When a service needs the API, it just passes `wrapper: self` to the client. This lets `Ai::Base` look at the calling object, grab its class name, and check its `VERSION`.

```ruby
# Called from inside a service object; `self` lets the client record the caller's class name and VERSION.
Ai::Base.new(wrapper: self)
```

This pattern gives us a really clean audit trail via our `AiAudit` model. Every single time we generate a vector or analyze a trend, we automatically log the model used, the caller's class, payloads, latency, and token counts. It makes debugging and tracking costs so much easier, without muddying up our core business logic.

---

## 2. A more personalized feed

Our main feed is powered by `FeedConfig`. It compiles custom SQL to score and rank articles for you. Historically, this was all hardcoded math based on things like tags and whether you follow the author.

Now, we've introduced a semantic feedback loop. As you interact with the platform, we compile a dynamic `interest_embedding` that represents what you actually care about.
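The post does not spell out how `interest_embedding` is compiled. One plausible approach, shown here purely as an illustration and not as Forem's actual implementation, is a weighted average of the embeddings of articles a user interacted with, normalized to unit length. The `build_interest_embedding` helper, the hash keys, and the weights are all hypothetical.

```ruby
# Hypothetical sketch: build a user's interest vector as a weighted mean
# of article embeddings. None of these names come from the Forem codebase.
def build_interest_embedding(interactions)
  return nil if interactions.empty?

  dims = interactions.first[:embedding].length
  sum = Array.new(dims, 0.0)

  # Accumulate each article vector, scaled by how strong the signal was.
  interactions.each do |interaction|
    interaction[:embedding].each_with_index do |value, i|
      sum[i] += value * interaction[:weight]
    end
  end

  # Normalize to unit length so only the direction matters for cosine similarity.
  norm = Math.sqrt(sum.sum { |v| v * v })
  return sum if norm.zero?

  sum.map { |v| v / norm }
end

# Toy 3-dimensional vectors; real Gemini Embeddings 2 vectors have 3,072 dimensions.
interactions = [
  { embedding: [1.0, 0.0, 0.0], weight: 3.0 }, # e.g. a comment (assumed heavier weight)
  { embedding: [0.0, 1.0, 0.0], weight: 1.0 }  # e.g. a reaction (assumed lighter weight)
]

interest = build_interest_embedding(interactions)
puts interest.inspect
```

In this toy run the resulting vector leans toward the first article because its interaction carries more weight, which is the kind of behavior a feedback loop like this would want.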
We use the `pgvector` extension in PostgreSQL to inject your interests directly into the SQL query:

```sql
-- <=> is pgvector's cosine distance operator, so 1 - distance is cosine similarity.
(
  CASE
    WHEN articles.semantic_embedding IS NOT NULL
         AND articles.published_at >= :published_since
    THEN (1 - (articles.semantic_embedding <=> :interest_embedding)) * :semantic_similarity_weight
    ELSE 0
  END
)
```

By using `1 - (embedding <=> user_interest)`, we get a cosine similarity score. We scale that by `:semantic_similarity_weight` and mix it in with standard social signals (like who you follow), post quality, and time decay.

This means a highly relevant post can rise to the top of your feed, but so can a globally trending post from a community member you love. It's all about balance.

---

## 3. What the heck is an embedding anyway? (And why v2 matters)

If you're new to the concept, an embedding is basically taking a piece of content (like an article's text) and turning it into a long list of numbers (a vector). These numbers map the content into a "semantic space." If two posts are talking about the same conceptual ideas, their vectors will sit close together mathematically, even if they use completely different wording.

We've upgraded this pipeline to use Google's newly released **Gemini Embeddings 2** model. A standard text embedding model only looks at words. But Gemini Embeddings 2 produces 3,072-dimensional vectors and maps everything into a single, unified semantic space.

### Future-proofing for a multi-modal DEV

The coolest part about moving to Embeddings 2 is that it isn't restricted to text. It natively accepts multimodal inputs, meaning text, code, images, audio, and video.

Right now, we're using it to analyze written DEV posts. But because the underlying math maps everything into the exact same vector space, we are future-proofing our infrastructure. As the DEV platform evolves, we can feed images, podcast audio, or video posts into the exact same database architecture.
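The reason the feed logic would not have to change is that the score depends only on two vectors, whatever content produced them. As a rough sketch of the arithmetic behind `1 - (a <=> b)`, here is the same computation in plain Ruby. The `cosine_similarity` and `semantic_score` helpers and the weight of `10.0` are illustrative assumptions, not values from the Forem codebase; in production the comparison happens inside PostgreSQL via `pgvector`.

```ruby
# Cosine similarity between two equal-length vectors.
# pgvector's `<=>` operator returns cosine distance, i.e. 1 minus this value.
def cosine_similarity(a, b)
  raise ArgumentError, "dimension mismatch" unless a.length == b.length

  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |y| y * y })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

# Mirrors the embedding part of the SQL CASE expression: no embedding, no boost.
# The weight is a hypothetical stand-in for :semantic_similarity_weight.
def semantic_score(article_embedding, interest_embedding, weight: 10.0)
  return 0.0 if article_embedding.nil?

  cosine_similarity(article_embedding, interest_embedding) * weight
end

# Toy 3-dimensional vectors standing in for 3,072-dimensional ones.
interest  = [0.9, 0.1, 0.0]
on_topic  = [0.8, 0.2, 0.0]
off_topic = [0.0, 0.1, 0.9]

puts semantic_score(on_topic, interest)
puts semantic_score(off_topic, interest)
puts semantic_score(nil, interest)
```

The on-topic vector scores close to the full weight, the off-topic one close to zero, and an article without an embedding contributes nothing, which matches the `ELSE 0` branch of the query.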
A user's `interest_embedding` will be able to surface an open-source video tutorial or a technical podcast episode based entirely on conceptual relevance, without us needing to rewrite our feed logic from scratch.

---

## 4. Catching nuanced trends 📈

Tags are great for high-level sorting, but they miss the highly specific, timely conversations. If Ruby 3.4 drops, a `#ruby` tag search won't distinguish between a "Hello World" tutorial and a deep debate about the new parser.

To fix this, we are in the process of building a clustering service powered by `TrendDetector`. Every 6 hours, a background job runs a Leader Clustering algorithm in pure Ruby:

* **Quality first:** We only look at recent articles scoring at least 15 points above our homepage minimum.
* **Clustering:** We measure the cosine distance between articles. If a post is close enough (`0.15` or less) to an existing cluster, it joins it. If not, it starts a new one.
* **Labeling:** Once a cluster hits 10 or more articles, we ask the Gemini API to label the trend and summarize the core debate.

We store all of this in `TrendMembership`, which lets us sort articles in the UI based on how close they are to the core topic.

All of this can be tracked via our open source codebase, Forem:

{% embed https://github.com/forem/forem %}

---

## 5. Putting the community first ❤️

Human curation, both from the broader community and our editorial perspective, is still the backbone of the system. We are using Gemini Embeddings to amplify what the community is already doing. It's about mixing the raw utility of vector search with the human spirit of developer-voted scores and relationships.

We want DEV to be the best place on the internet to share code and talk about software. We think this is a big step in that direction.

What do you think? Let me know in the comments.

Happy coding!
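For readers who want to see the clustering step from section 4 in code, here is a minimal leader clustering sketch in plain Ruby, using the thresholds quoted in the post (cosine distance of `0.15` or less to join, 10 or more articles before labeling). The `leader_cluster` helper, the hash layout, and the choice to compare each article against a cluster's first article (its leader) rather than a running centroid are assumptions for illustration; the real `TrendDetector` may differ.

```ruby
# Cosine distance: 0.0 for identical directions, 1.0 for orthogonal vectors.
def cosine_distance(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |y| y * y })
  return 1.0 if norm.zero?

  1.0 - (dot / norm)
end

# Single pass: each article joins the first cluster whose leader is within
# the threshold; otherwise it becomes the leader of a new cluster.
# The 0.15 default comes from the post; everything else is hypothetical.
def leader_cluster(articles, threshold: 0.15)
  articles.each_with_object([]) do |article, clusters|
    match = clusters.find do |cluster|
      cosine_distance(cluster[:leader], article[:embedding]) <= threshold
    end

    if match
      match[:members] << article
    else
      clusters << { leader: article[:embedding], members: [article] }
    end
  end
end

# Toy 2-dimensional embeddings standing in for 3,072-dimensional ones.
articles = [
  { id: 1, embedding: [1.0, 0.0] },
  { id: 2, embedding: [0.98, 0.05] },
  { id: 3, embedding: [0.0, 1.0] }
]

clusters = leader_cluster(articles)
clusters.each { |cluster| puts cluster[:members].map { |a| a[:id] }.inspect }

# Only clusters with 10 or more articles would be sent off for labeling.
label_min_size = 10
ready_to_label = clusters.select { |cluster| cluster[:members].size >= label_min_size }
puts ready_to_label.size
```

Leader clustering needs only one pass over the articles and no preset number of clusters, which suits a periodic background job. The trade-off is that results depend on the order in which articles are processed.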

    Tags

gemini, ai, googlecloud, postgres
