    Seamless scaling with VPA In-place Pod Resize on GKE

    Olivier Bourgeois May 20, 2026

    Learn how VPA In-place Pod Resize can help seamlessly vertically scale workloads on Google Kubernetes Engine (GKE).

Right-sizing Kubernetes workloads is a common platform engineering challenge. Set your requests too high, and you burn cloud budgets on idle capacity; set your limits too low, and your applications face throttling or dreaded OOMKills.

For years, the [**Vertical Pod Autoscaler (VPA)**](https://cloud.google.com/kubernetes-engine/docs/concepts/verticalpodautoscaler?utm_campaign=CDR_0x5723eddc_default_b464422378&utm_medium=external&utm_source=blog) has been the standard answer to this problem, automatically adjusting CPU and memory requests based on actual usage.

However, this method of scaling came with a significant catch that prevented widespread adoption for critical workloads: applying new resource parameters required evicting and restarting the pod. This disruption was often unacceptable for stateful applications, long-running connections, or latency-sensitive services.

## Introducing In-place Pod Resize (IPPR) on GKE

[**In-place Pod Resize (IPPR)**](https://cloud.google.com/kubernetes-engine/docs/concepts/verticalpodautoscaler?utm_campaign=CDR_0x5723eddc_default_b464422378&utm_medium=external&utm_source=blog#inplaceorrecreate_mode) changes the game by allowing Kubernetes to modify resource requests and limits on live, running containers directly through the underlying container runtime, *without* triggering a restart.

By combining the intelligence of VPA with the non-disruptive nature of IPPR, GKE users finally have a viable path to dynamic, seamless, and automated right-sizing.

*Note: As of writing, VPA IPPR is in Preview on GKE. While it is a massive step forward, I recommend evaluating it in staging environments before rolling it out to production workloads.*

## Getting started with IPPR

To use In-place Pod Resize, you need a [GKE cluster](https://docs.cloud.google.com/kubernetes-engine/docs/concepts/choose-cluster-mode?utm_campaign=CDR_0x5723eddc_default_b464422378&utm_medium=external&utm_source=blog) running version **1.34.0-gke.2201000 or later**.

* **GKE Autopilot:** VPA is enabled by default.
* **GKE Standard:** Requires the Vertical Pod Autoscaling feature to be enabled.

### 1. Enable the feature

If you aren't using Autopilot, ensure your cluster is created or updated with the necessary feature flags:

```shell
# Create a new cluster with Vertical Pod Autoscaling enabled.
gcloud container clusters create CLUSTER_NAME \
    --project=PROJECT_ID \
    --location=us-east1 \
    --release-channel=rapid \
    --enable-vertical-pod-autoscaling

# Or enable Vertical Pod Autoscaling on an existing cluster.
gcloud container clusters update CLUSTER_NAME \
    --project=PROJECT_ID \
    --location=us-east1 \
    --enable-vertical-pod-autoscaling
```

### 2. Define your VPA object

Create a `VerticalPodAutoscaler` resource targeting your Deployment or StatefulSet. The crucial element here is setting `spec.updatePolicy.updateMode` to `InPlaceOrRecreate`.

```yaml
apiVersion: "autoscaling.k8s.io/v1"
kind: "VerticalPodAutoscaler"
metadata:
  name: "my-vpa"
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: "Deployment"
    name: "my-deployment"
  updatePolicy:
    updateMode: "InPlaceOrRecreate"
```

### 3. Watch it scale

Apply the resource to your cluster and monitor your application under load. Instead of watching Pods terminate and recreate, you can watch the resources modify live using `kubectl describe`.

```shell
kubectl describe pod POD_NAME
```

Look for the *AllocatedResources* field or check the events section. You will see the requests change in real time to match the VPA recommendations, while the *Restart Count* remains exactly the same.

**The "Or Recreate" Fallback:** Keep in mind that physics still apply. If VPA recommends a resource size that exceeds the remaining capacity of the Node your Pod is currently running on, an in-place resize is impossible. In this scenario, VPA will fall back to evicting and recreating the Pod so it can be scheduled onto a larger or emptier Node.

## Ready to dive deeper?

While this introduction covers the basics of IPPR, right-sizing is just one part of a robust scaling strategy. Implementing VPA often goes hand-in-hand with horizontal scaling and cluster autoscaling.

Check out the guide to master scaling on GKE: [Run full-stack workloads at scale on GKE](https://cloud.google.com/kubernetes-engine/docs/tutorials/full-stack-scale?utm_campaign=CDR_0x5723eddc_default_b464422378&utm_medium=external&utm_source=blog).
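
For the check described in step 3 ("Watch it scale"), a small helper can print only the fields of interest instead of the full `kubectl describe` output. This is a minimal sketch and not part of the original walkthrough: the function name `pod_resize_status` is hypothetical, and it assumes your cluster version populates `status.containerStatuses[].resources`, the field that reports the resources currently applied to a running container.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name, restart count, and currently applied
# resources for each container in a Pod, one container per line.
# Usage: pod_resize_status POD_NAME [NAMESPACE]
pod_resize_status() {
  local pod="$1"
  local namespace="${2:-default}"
  # Assumption: status.containerStatuses[].resources is populated, which
  # requires a cluster version with in-place resize support.
  kubectl get pod "$pod" --namespace "$namespace" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.restartCount}{"\t"}{.resources}{"\n"}{end}'
}
```

Running `pod_resize_status POD_NAME` before and after VPA applies a recommendation should show the resource values change while the restart count stays the same, provided the resize happened in place. If the restart count resets or the Pod name changes, the "Or Recreate" fallback was likely used instead.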

    Tags: kubernetes, ai, gke, googlecloud
