
    Gemini 3.1 Flash-Lite: Built for intelligence at scale

    Alisa Fortin March 3, 2026

Today, we're introducing Gemini 3.1 Flash-Lite, our fastest and most cost-efficient Gemini 3 series model. Built for high-volume developer workloads, 3.1 Flash-Lite delivers high quality for its price and model tier. Starting today, 3.1 Flash-Lite is rolling out in preview to developers via the Gemini API in [Google AI Studio](https://aistudio.google.com/prompts/new_chat?model=gemini-3.1-flash-lite-preview) and to enterprises via [Vertex AI](https://console.cloud.google.com/vertex-ai/studio/multimodal?mode=prompt&model=gemini-3.1-flash-lite-preview).

## Cost-efficiency without compromise

Priced at just $0.25/1M input tokens and $1.50/1M output tokens, 3.1 Flash-Lite delivers enhanced performance at a fraction of the cost of larger models. It outperforms 2.5 Flash with a 2.5X faster Time to First Answer Token and a 45% increase in output speed, according to the [Artificial Analysis benchmark](https://artificialanalysis.ai/), while maintaining similar or better quality. This low latency is essential for high-frequency workflows, making it an ideal model for developers building responsive, real-time experiences.
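To make the pricing concrete, here is a minimal sketch that estimates spend from token counts using only the two list prices quoted above. The function name and the example workload (one million requests of 500 input and 100 output tokens each) are hypothetical, and the sketch ignores anything not stated in this post, such as separate billing for thinking tokens, caching, or batch discounts.

```python
from decimal import Decimal

# Preview list prices quoted in this post (USD per 1M tokens).
INPUT_USD_PER_MILLION = Decimal("0.25")
OUTPUT_USD_PER_MILLION = Decimal("1.50")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost in USD for the given token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * INPUT_USD_PER_MILLION
    output_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical workload: 1,000,000 requests,
    # each with 500 input tokens and 100 output tokens.
    requests = 1_000_000
    total = estimate_cost_usd(500 * requests, 100 * requests)
    print(f"${total:.2f}")  # prints "$275.00"
```

In this hypothetical workload, output tokens account for $150 of the $275 even though they are a sixth of the total volume, so trimming response length is likely the first lever to try.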
![The image shows two bar charts titled "Speed & Cost Efficiency," comparing the "Output speed (higher is better)" and "Price (lower is better)" of Gemini 3.1 Flash-Lite against several other models, including Gemini 2.5 Flash-Lite, GPT-5 mini, Claude 4.5 Haiku, and Grok 4.1 Fast.](https://storage.googleapis.com/gweb-uniblog-publish-prod/original_images/gemini-3.1_speed-cost_chart_1.gif)

<video src="https://storage.googleapis.com/gweb-uniblog-publish-prod/original_videos/MMMU_v2.mp4#t=0.001"></video>
<center><small>Gemini 3.1 Flash-Lite outperforms 2.5 Flash in speed and quality.</small></center>

3.1 Flash-Lite achieves an Elo score of 1432 on the [Arena.ai Leaderboard](http://arena.ai/) and outperforms other models of a similar tier across reasoning and multimodal understanding benchmarks, scoring 86.9% on GPQA Diamond and 76.8% on MMMU Pro. It even surpasses larger Gemini models from prior generations, like 2.5 Flash.

![The image displays a comparison table of several AI models, including "Gemini 3.1 Flash-Lite," "Gemini 2.5 Dynamic," "Gemini 2.5 Flash-Lite," "GPT-5 mini," "Claude 4.5 Haiku," and "Grok 4.1 Fast," across various metrics such as input/output price, output speed, and different academic, reasoning, and factual benchmarks.](https://storage.googleapis.com/gweb-uniblog-publish-prod/original_images/gemini-3.1-flash-lite-table_1.gif)

## Adaptive intelligence at scale for developers

Beyond its raw performance, Gemini 3.1 Flash-Lite comes standard with thinking levels in AI Studio and Vertex AI, giving developers the control and flexibility to select how much the model “thinks” for a task, which is critical for managing high-frequency workloads. 3.1 Flash-Lite can tackle tasks at scale, like high-volume translation and content moderation, where cost is a priority. It can also handle more complex workloads that need in-depth reasoning, like generating user interfaces and dashboards, creating simulations or following instructions.
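As a rough illustration of choosing a thinking level per task, the sketch below builds a request body without sending it. The model ID comes from the AI Studio link in this post; the `generationConfig.thinkingConfig.thinkingLevel` field path, the `"low"` and `"high"` values, and the task-to-level mapping are assumptions for illustration and should be checked against the current Gemini API reference before use.

```python
import json

# Model ID as it appears in the AI Studio link in this post.
MODEL_ID = "gemini-3.1-flash-lite-preview"

# Hypothetical mapping from task type to thinking level.
# The level names are assumptions, not confirmed by this post.
THINKING_LEVEL_BY_TASK = {
    "translation": "low",          # high volume, cost-sensitive
    "content_moderation": "low",   # high volume, cost-sensitive
    "ui_generation": "high",       # benefits from deeper reasoning
    "simulation": "high",          # benefits from deeper reasoning
}


def build_request(task: str, prompt: str) -> dict:
    """Build a request description with a thinking level chosen by task type.

    Unknown task types fall back to "low" to keep cost down.
    """
    level = THINKING_LEVEL_BY_TASK.get(task, "low")
    return {
        "model": MODEL_ID,
        "body": {
            "contents": [{"parts": [{"text": prompt}]}],
            # Assumed field path for the thinking-level setting.
            "generationConfig": {"thinkingConfig": {"thinkingLevel": level}},
        },
    }


if __name__ == "__main__":
    request = build_request("translation", "Translate 'hello' into French.")
    print(json.dumps(request, indent=2))
```

Keeping the mapping in one table makes it easy to adjust a task's level after measuring quality and cost on real traffic.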
<video src="https://storage.googleapis.com/gweb-uniblog-publish-prod/original_videos/CategoryGeneration_v4.mp4#t=0.001"></video>
<center><small>3.1 Flash-Lite instantly fills an e-commerce wireframe with hundreds of products in different categories.</small></center>

---

<video src="https://storage.googleapis.com/gweb-uniblog-publish-prod/original_videos/WeatherDashboard_v5.mp4#t=0.001"></video>
<center><small>3.1 Flash-Lite can <a href="https://aistudio.google.com/apps/bundled/weather_dashboard_agent">generate dynamic weather dashboards in real time</a>, using live forecasts and historical data.</small></center>

---

<video src="https://storage.googleapis.com/gweb-uniblog-publish-prod/original_videos/SaasReport_v3.mp4#t=0.001"></video>
<center><small>3.1 Flash-Lite creates a SaaS agent capable of <a href="https://aistudio.google.com/apps/bundled/versatile_execution_agent">executing versatile, multi-step tasks for a business</a>.</small></center>

---

<video src="https://storage.googleapis.com/gweb-uniblog-publish-prod/original_videos/Photo_sorter_Demo_v2_1_small.mp4#t=0.001"></video>
<center><small>3.1 Flash-Lite can quickly analyze and sort large volumes of content, such as images.</small></center>

Early-access developers on AI Studio and Vertex AI, along with companies like Latitude, Cartwheel and Whering, are already using 3.1 Flash-Lite to solve complex problems at scale. Early testers highlighted 3.1 Flash-Lite’s efficiency and reasoning capabilities, saying it can handle complex inputs with the precision of a larger-tier model while following instructions and maintaining adherence.
![Quote from Kolby Nottingham at Latitude regarding the instruction-following capabilities and speed of Google's model.](https://storage.googleapis.com/gweb-uniblog-publish-prod/images/3Flash-Lite_Blog_Quote_1.max-1080x1080.format-webp.webp)

---

![](https://storage.googleapis.com/gweb-uniblog-publish-prod/images/3Flash-Lite_Blog_Quote_2.max-1080x1080.format-webp.webp)

---

![](https://storage.googleapis.com/gweb-uniblog-publish-prod/images/3Flash-Lite_Blog_Quote_3.max-1080x1080.format-webp.webp)

---

![](https://storage.googleapis.com/gweb-uniblog-publish-prod/images/3Flash-Lite_Blog_Quote_4.max-1080x1080.format-webp.webp)

---

We look forward to seeing what you build with 3.1 Flash-Lite and the rest of the Gemini 3 series models.
