Gemini Embedding 2: Our first natively multimodal embedding model

    Patrick Loeber March 10, 2026

Today we're releasing Gemini Embedding 2, our first fully multimodal embedding model built on the Gemini architecture, in Public Preview via the [Gemini API](https://ai.google.dev/gemini-api/docs/embeddings) and [Vertex AI](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-multimodal-embeddings).

Expanding on our previous text-only foundation, Gemini Embedding 2 maps text, images, videos, audio and documents into a single, unified embedding space, and captures semantic intent across over 100 languages. This simplifies complex pipelines and enhances a wide variety of multimodal downstream tasks, from Retrieval-Augmented Generation (RAG) and semantic search to sentiment analysis and data clustering.

### New modalities and flexible output dimensions

The model is based on Gemini and leverages its best-in-class multimodal understanding capabilities to create high-quality embeddings across:

* **Text:** supports an expansive context of up to 8192 input tokens
* **Images:** capable of processing up to 6 images per request, supporting PNG and JPEG formats
* **Videos:** supports up to 120 seconds of video input in MP4 and MOV formats
* **Audio:** natively ingests and embeds audio data without needing intermediate text transcriptions
* **Documents:** directly embed PDFs up to 6 pages long

Beyond processing one modality at a time, this model natively understands interleaved input so you can pass multiple modalities of input (e.g., image + text) in a single request. This allows the model to capture the complex, nuanced relationships between different media types, unlocking more accurate understanding of complex, real-world data.

![multimodal input](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/zr6oj4yh7x028v19wevs.png)

Like our previous embedding models, Gemini Embedding 2 incorporates Matryoshka Representation Learning (MRL), a technique that “nests” information by dynamically scaling down dimensions.
This enables flexible output dimensions scaling down from the default 3072 so developers can balance performance and storage costs. We recommend using 3072, 1536, or 768 dimensions for highest quality.

To see these embeddings in action, try out our lightweight multimodal semantic search [demo](https://findmemedia.lmm.ai/).

### State-of-the-art performance

Gemini Embedding 2 doesn't just improve on legacy models. It establishes a new performance standard for multimodal depth, introducing strong speech capabilities and outperforming leading models in text, image, and video tasks. This measurable improvement and unique multimodal coverage give developers exactly what they need for their diverse embedding needs.

![benchmarks](https://storage.googleapis.com/gweb-uniblog-publish-prod/images/gemini-embedding-2-benchmarks.width-1000.format-webp.webp)

### Unlocking deeper meaning for data

Embeddings are the technology that powers experiences in many Google products. From RAG, where embeddings can play a crucial role in context engineering, to large-scale data management and classic search/analysis, some of our early access partners are already using Gemini Embedding 2 to unlock high-value multimodal applications.

### Start building today

Get started with the Gemini Embedding 2 model through the [Gemini API](https://ai.google.dev/gemini-api/docs/embeddings) or [Vertex AI](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-multimodal-embeddings).
```python
from google import genai
from google.genai import types

# For Vertex AI:
# PROJECT_ID='<add_here>'
# client = genai.Client(vertexai=True, project=PROJECT_ID, location='us-central1')
client = genai.Client()

with open("example.png", "rb") as f:
    image_bytes = f.read()

with open("sample.mp3", "rb") as f:
    audio_bytes = f.read()

# Embed text, image, and audio
result = client.models.embed_content(
    model="gemini-embedding-2-preview",
    contents=[
        "What is the meaning of life?",
        types.Part.from_bytes(
            data=image_bytes,
            mime_type="image/png",
        ),
        types.Part.from_bytes(
            data=audio_bytes,
            mime_type="audio/mpeg",
        ),
    ],
)

print(result.embeddings)
```

Learn how to use the model in our interactive [Gemini API](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Embeddings.ipynb) and [Vertex AI](https://github.com/GoogleCloudPlatform/generative-ai/tree/main/gemini/embedding/intro_gemini_embedding.ipynb) Colab notebooks. You can also use it through [LangChain](https://docs.langchain.com/oss/python/integrations/text_embedding/google_generative_ai), [LlamaIndex](https://developers.llamaindex.ai/python/framework/integrations/embeddings/google_genai/), [Haystack](https://haystack.deepset.ai/integrations/google-genai), [Weaviate](https://docs.weaviate.io/weaviate/model-providers/google), [Qdrant](https://qdrant.tech/documentation/embeddings/gemini/), [ChromaDB](https://docs.trychroma.com/integrations/embedding-models/google-gemini), and [Vector Search](https://docs.cloud.google.com/vertex-ai/docs/vector-search-2/overview).

By bringing semantic meaning to the diverse data around us, Gemini Embedding 2 provides the essential multimodal foundation for the next era of advanced AI experiences. We can't wait to see what you build.
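The flexible output dimensions described in the MRL section can also be illustrated on the client side. The sketch below is not taken from the announcement: it assumes that MRL-style embeddings can be shortened by keeping the leading components and re-normalizing, which is the usual way such vectors are handled, and then ranks items by cosine similarity as a semantic search would. The helper names (`truncate_and_normalize`, `rank_by_similarity`), the file names, and the 4-dimensional toy vectors are all hypothetical stand-ins for real 3072-dimensional model output. In practice you may prefer to request a smaller size from the API itself; the google-genai SDK exposes an `output_dimensionality` field on `types.EmbedContentConfig`, though whether the preview model honors every size is an assumption here.

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components of `vec` and rescale to unit length."""
    if dim <= 0 or dim > len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


def rank_by_similarity(query, corpus, dim):
    """Rank corpus items by cosine similarity to the query at `dim` dimensions.

    Both sides are unit-normalized after truncation, so the dot product
    equals the cosine similarity.
    """
    q = truncate_and_normalize(query, dim)
    scored = []
    for name, vec in corpus.items():
        d = truncate_and_normalize(vec, dim)
        scored.append((name, sum(a * b for a, b in zip(q, d))))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors (hypothetical): real embeddings would come from embed_content.
query = [0.9, 0.1, 0.0, 0.2]
corpus = {
    "cat_photo.png": [0.8, 0.2, 0.1, 0.1],
    "invoice.pdf": [0.0, 0.1, 0.9, 0.3],
}

# Compare using only the first 2 of 4 dimensions.
ranking = rank_by_similarity(query, corpus, dim=2)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because all modalities share one embedding space, the same ranking function would apply whether the stored vectors came from text, images, audio, video, or PDFs. Smaller dimensions reduce storage and comparison cost at some loss of quality, so it is worth measuring retrieval accuracy on your own data before settling on a size.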

    Tags

    ai, embedding, google, news
