
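I. Overview

1.1 Background

Online traffic has pronounced peaks and troughs. During the daytime peak there are too few Pods and requests queue up; in the early-morning trough many Pods sit idle and waste resources. Scaling by hand is not realistic: nobody is going to log in at 8 a.m. every day to add Pods and again at midnight to remove them.

Kubernetes provides two autoscaling mechanisms:

HPA (Horizontal Pod Autoscaler): horizontal scaling; adjusts the number of Pod replicas automatically based on metrics

VPA (Vertical Pod Autoscaler): vertical scaling; adjusts each Pod's CPU/memory requests and limits automatically based on actual usage

The two solve different problems. HPA answers "how many Pods are needed"; VPA answers "how much resource does each Pod need". In production HPA is usually the one doing the scaling, while VPA mostly assists with resource planning.

1.2 Technical Characteristics

HPA: scales horizontally using Metrics Server or custom metrics (Prometheus Adapter). It supports CPU, memory, custom metrics and external metrics. The scaling algorithm is desiredReplicas = ceil(currentReplicas * (currentMetricValue / desiredMetricValue))

VPA: consists of three components (Recommender, Updater, Admission Controller). It analyzes historical resource usage to recommend reasonable request values and supports four update modes: Auto, Recreate, Initial and Off

Stabilization window: HPA v2 supports stabilizationWindowSeconds, which keeps metric jitter from causing frequent scaling. The default window is 300 seconds for scale-down and 0 seconds for scale-up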
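
As a rough illustration of the scaling formula above, the Python sketch below reproduces the calculation. The function name and the sample numbers are illustrative assumptions. The real controller additionally applies a tolerance band, stabilization windows and minReplicas/maxReplicas clamping, all of which are left out here.

```python
import math


def desired_replicas(current_replicas: int, current_value: float, target_value: float) -> int:
    """desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)."""
    return math.ceil(current_replicas * current_value / target_value)


# Hypothetical sample: 2 replicas at 248% average CPU utilization, target 50%
print(desired_replicas(2, 248, 50))
# Hypothetical sample: 10 replicas at 20% utilization, target 50% (scale-down)
print(desired_replicas(10, 20, 50))
```

When the measured value equals the target, the ratio is 1 and the replica count stays where it is.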

1.3 Use Cases

HPA use cases: absorbing traffic swings for stateless services such as web services and API gateways; scaling message-queue consumers on queue backlog depth; adjusting the number of batch workers to the number of pending tasks

VPA use cases: a newly launched service with unknown resource needs, where VPA in Off mode supplies recommendations that you then apply by hand; Java applications with a stable JVM memory pattern, where VPA adjusts memory automatically to avoid OOMKills; long-running stateful services that do not scale out well and instead gain per-Pod capacity through vertical scaling

HPA + VPA together: set VPA to Off mode so it only produces recommendations, and let HPA do the actual scaling. Running both in automatic mode on CPU/memory metrics makes them conflict; do not do this in production

1.4 Environment Requirements

Component	Version requirement	Notes
Kubernetes	1.23+	The HPA autoscaling/v2 API went GA in 1.23; older clusters have to use v2beta2
Metrics Server	0.6.0+	HPA depends on it for CPU/memory metrics; mandatory
VPA	0.14.0+	Deployed separately; not part of the default Kubernetes components
Prometheus + Adapter	Prometheus 2.40+, Adapter 0.11+	Required for scaling on custom metrics; not needed for CPU/memory-only scaling
Cluster nodes	At least 3 worker nodes recommended	With too few nodes, the Pods HPA adds have nowhere to be scheduled

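II. Detailed Steps

2.1 Preparation

2.1.1 Check Cluster State

# Check the Kubernetes version; HPA autoscaling/v2 needs 1.23+
# (older kubectl releases accept "kubectl version --short"; that flag was removed in kubectl 1.28)
kubectl version

# Check whether Metrics Server is already installed
kubectl get deployment metrics-server -n kube-system

# If the Deployment is not found, check whether node metrics are available anyway
kubectl top nodes
# An error such as "Metrics API not available" means Metrics Server has to be installed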

2.1.2 Install Metrics Server

Metrics Server is a base dependency of HPA; without it HPA cannot read CPU and memory metrics.

# Apply the Metrics Server manifest
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.7.1/components.yaml

# On clusters with self-signed kubelet certificates (for example, built with kubeadm),
# add a startup flag that skips certificate verification
# Edit the Metrics Server Deployment
kubectl edit deployment metrics-server -n kube-system

Add the following under containers.args:

args:
- --cert-dir=/tmp
- --secure-port=10250
- --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
- --kubelet-use-node-status-port
- --metric-resolution=15s
# Required on self-signed clusters, otherwise Metrics Server cannot reach the kubelet
- --kubelet-insecure-tls

# Wait for Metrics Server to become ready
kubectl rollout status deployment metrics-server -n kube-system

# Verify that metrics are available
kubectl top nodes
kubectl top pods -A

Note: use --kubelet-insecure-tls only in test environments or clusters with self-signed certificates. Production clusters should be configured with a proper CA certificate.

2.1.3 Deploy a Test Application

Create a Deployment for testing HPA. It must set resources.requests, otherwise HPA cannot compute the CPU utilization percentage.

# test-app.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: php-apache
  template:
    metadata:
      labels:
        app: php-apache
    spec:
      containers:
      - name: php-apache
        image: registry.k8s.io/hpa-example
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 200m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 256Mi
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
  namespace: default
spec:
  selector:
    app: php-apache
  ports:
  - port: 80
    targetPort: 80

kubectl apply -f test-app.yaml
kubectl wait --for=condition=available deployment/php-apache --timeout=60s

2.2 Core Configuration

2.2.1 Basic HPA Configuration: CPU Metric

# hpa-cpu.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        # Target CPU utilization of 50%, computed against requests
        # With requests of 200m, an actual usage of 100m is 50%
        averageUtilization: 50
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 4
        periodSeconds: 15
      selectPolicy: Max
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
      selectPolicy: Min

kubectl apply -f hpa-cpu.yaml
kubectl get hpa php-apache-hpa

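Parameters:

minReplicas: 2: minimum replica count. Use at least 2 in production; with 1, a full scale-down leaves a single Pod, and if it dies the service is down

maxReplicas: 20: maximum replica count. Set the ceiling from the cluster's resource capacity so that runaway scale-up cannot exhaust the nodes

averageUtilization: 50: target utilization. Too low (for example 20%) produces too many Pods and wastes resources; too high (for example 90%) leaves no headroom for bursts. 50% is a fairly safe starting point

scaleUp.stabilizationWindowSeconds: 0: scale up without waiting, as soon as traffic arrives

scaleDown.stabilizationWindowSeconds: 300: wait 5 minutes before scaling down, so that a brief dip does not remove Pods that are needed again moments later

Reading the behavior policies:

Scale-up uses Max: the larger of "double the current replica count" and "add 4 Pods" wins, so that scale-up is fast enough under burst traffic

Scale-down uses Min: the policy that removes the fewest Pods wins. With only the 10% policy defined, at most 10% of the replicas are removed per minute, which keeps scale-down smooth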
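
To make the scale-up policy selection concrete, here is a minimal Python sketch of how the two policies from hpa-cpu.yaml combine within one 15-second period. The function name is hypothetical and the rounding (rounding the Percent policy up) is an assumption; the controller additionally caps the result at the metric-derived replica count and at maxReplicas.

```python
import math


def scale_up_ceiling(current: int, percent: int, pods: int, select_policy: str = "Max") -> int:
    """Highest replica count the policies allow within one period (simplified)."""
    by_percent = math.ceil(current * (1 + percent / 100))  # Percent policy
    by_pods = current + pods                               # Pods policy
    pick = max if select_policy == "Max" else min
    return pick(by_percent, by_pods)


for replicas in (2, 4, 10):
    print(replicas, "->", scale_up_ceiling(replicas, percent=100, pods=4))
```

At small replica counts the Pods policy dominates (2 can grow to 6); once the Deployment is larger, the Percent policy takes over (10 can grow to 20).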

2.2.2 Multi-Metric HPA Configuration: CPU + Memory + Custom Metrics

# hpa-multi-metrics.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 3
  maxReplicas: 50
  metrics:
  # CPU metric
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
  # Memory metric
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70
  # Custom metric: QPS per Pod (needs Prometheus Adapter)
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "1000"
  # External metric: message-queue backlog (needs Prometheus Adapter)
  - type: External
    external:
      metric:
        name: rabbitmq_queue_messages
        selector:
          matchLabels:
            queue: "task-queue"
      target:
        type: AverageValue
        averageValue: "50"

Note: with multiple metrics, HPA computes a desired replica count for each metric and takes the maximum. For example, if CPU calls for 5 Pods and QPS calls for 8 Pods, the Deployment is scaled to 8.
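
The "take the maximum" rule can be sketched in a few lines of Python. The measured values below are made-up inputs chosen so that the result matches the 5-versus-8 example; the helper name is hypothetical.

```python
import math


def proposal(current_replicas: int, current_avg: float, target_avg: float) -> int:
    """Per-metric desired replica count, using the basic HPA formula."""
    return math.ceil(current_replicas * current_avg / target_avg)


current = 4  # hypothetical current replica count
proposals = {
    "cpu": proposal(current, 75, 60),      # assumed 75% measured vs. 60% target
    "qps": proposal(current, 2000, 1000),  # assumed 2000 req/s per Pod vs. 1000 target
}
final_replicas = max(proposals.values())
print(proposals, "->", final_replicas)
```

Because the largest proposal wins, one saturated metric is enough to trigger scale-up, while scale-down only happens once every metric agrees.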

2.2.3 Install Prometheus Adapter (required for custom-metric scaling)

# Install Prometheus Adapter with Helm
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

helm install prometheus-adapter prometheus-community/prometheus-adapter \
  --namespace monitoring \
  --set prometheus.url=http://prometheus-server.monitoring.svc \
  --set prometheus.port=9090

Prometheus Adapter configuration that maps Prometheus metrics onto the Kubernetes Custom Metrics API:

# prometheus-adapter-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-adapter
  namespace: monitoring
data:
  config.yaml: |
    rules:
    # Map Prometheus http_requests_total to requests per second
    - seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace: {resource: "namespace"}
          pod: {resource: "pod"}
      name:
        matches: "^(.*)_total$"
        as: "${1}_per_second"
      metricsQuery: 'rate(<<.Series>>{<<.LabelMatchers>>}[2m])'
    # RabbitMQ queue depth
    # Note: an HPA metric of type External reads the External Metrics API;
    # to serve it there, the adapter expects the rule under externalRules instead of rules
    - seriesQuery: 'rabbitmq_queue_messages{queue!=""}'
      resources:
        template: "<<.Resource>>"
      name:
        matches: "^(.*)"
        as: "$1"
      metricsQuery: '<<.Series>>{<<.LabelMatchers>>}'

# Verify that the custom metrics API is available
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/http_requests_per_second" | jq .

2.2.4 Install and Configure VPA

# Clone the VPA project
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler

# Install VPA (deploys the Recommender, Updater and Admission Controller)
./hack/vpa-up.sh

# Verify that the VPA components are running
kubectl get pods -n kube-system | grep vpa
# Three Pods should be listed:
# vpa-admission-controller-xxx   Running
# vpa-recommender-xxx            Running
# vpa-updater-xxx                Running

VPA configuration example:

# vpa-auto.yaml - starts in Off mode (recommend only); switch updateMode to "Auto"
# once the recommendations look right to let VPA adjust Pod resources itself
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: php-apache-vpa
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  updatePolicy:
    # Auto: recreates Pods automatically and applies the recommendation
    # Recreate: like Auto; the recommendation is applied when the Pod is recreated
    # Initial: applied only when a Pod is first created
    # Off: recommend only, never act; the safest mode
    updateMode: "Off"
  resourcePolicy:
    containerPolicies:
    - containerName: php-apache
      minAllowed:
        cpu: 100m
        memory: 64Mi
      maxAllowed:
        cpu: 2
        memory: 2Gi
      controlledResources: ["cpu", "memory"]
      controlledValues: RequestsOnly

kubectl apply -f vpa-auto.yaml

# Wait a few minutes, then look at the VPA recommendation
kubectl get vpa php-apache-vpa -o yaml

Production advice: start with updateMode Off, check that the recommendations are reasonable, and only then switch to Auto. In Auto mode VPA recreates Pods to apply new resource settings, and Pods being recreated during peak hours can cause brief unavailability.

2.2.5 Using HPA and VPA Together

HPA and VPA must not both act on CPU/memory metrics at the same time; they fight each other. Two ways to combine them correctly:

# Option 1: VPA in Off mode (recommendations only), HPA works as usual
# vpa-off-mode.yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-app-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  updatePolicy:
    updateMode: "Off"
---
# Regular HPA configuration
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 3
  maxReplicas: 30
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60

# Option 2: VPA manages memory, HPA scales on a custom metric (no CPU/memory)
# In this setup VPA can run in Auto mode
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-app-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: web-app
      controlledResources: ["memory"]
      controlledValues: RequestsOnly
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 3
  maxReplicas: 30
  metrics:
  # Custom metric only, no CPU/memory, to avoid conflicts with VPA
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "800"

2.3 Startup and Verification

2.3.1 Load-Testing HPA Scale-Up

# In one terminal, watch the HPA status continuously
kubectl get hpa php-apache-hpa -w

# In a second terminal, generate load with busybox
kubectl run -i --tty load-generator --rm --image=busybox:1.36 --restart=Never -- /bin/sh -c \
  "while sleep 0.01; do wget -q -O- http://php-apache; done"

# Watch the HPA output; after roughly 1-2 minutes it looks like this:
# NAME             REFERENCE               TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
# php-apache-hpa   Deployment/php-apache   248%/50%   2         20        2          5m
# php-apache-hpa   Deployment/php-apache   248%/50%   2         20        10         6m

# After the load stops, wait 5 minutes (the scale-down stabilization window);
# the Pod count then shrinks step by step back to minReplicas

2.3.2 Verifying VPA Recommendations

# Show the VPA recommendation
kubectl get vpa php-apache-vpa -o jsonpath='{.status.recommendation}' | jq .

# Example output:
# {
#   "containerRecommendations": [
#     {
#       "containerName": "php-apache",
#       "lowerBound": { "cpu": "100m", "memory": "64Mi" },
#       "target": { "cpu": "250m", "memory": "180Mi" },
#       "uncappedTarget": { "cpu": "250m", "memory": "180Mi" },
#       "upperBound": { "cpu": "500m", "memory": "300Mi" }
#     }
#   ]
# }
# target is the resource setting VPA recommends

2.3.3 Verifying Scaling Events

# Show events related to the HPA
kubectl describe hpa php-apache-hpa

# Look at the Events section:
# Events:
#   Type    Reason             Age   From                       Message
#   ----    ------             ----  ----                       -------
#   Normal  SuccessfulRescale  2m    horizontal-pod-autoscaler  New size: 10; reason: cpu resource utilization above target

# Show the scaling history
kubectl get events --field-selector reason=SuccessfulRescale --sort-by='.lastTimestamp'

III. Example Code and Configuration

3.1 Complete Configuration Examples

3.1.1 Production-Grade HPA Configuration

# File path: /opt/k8s/hpa/production-hpa.yaml
# HPA configuration for a production e-commerce API service
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ecommerce-api-hpa
  namespace: production
  labels:
    app: ecommerce-api
    managed-by: hpa
  annotations:
    # Record why the configuration changed, for auditing
    kubernetes.io/change-cause: "Initial HPA config, target CPU 60%, QPS 1200/pod"
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ecommerce-api
  minReplicas: 5
  maxReplicas: 100
  metrics:
  # Primary metric: CPU utilization
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
  # Secondary metric: memory utilization (guards against memory leaks)
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 75
  # Business metric: requests per second per Pod
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "1200"
  # Business metric: P99 latency (milliseconds)
  - type: Pods
    pods:
      metric:
        name: http_request_duration_p99_milliseconds
      target:
        type: AverageValue
        averageValue: "200"
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      # Fast scale-up: add at most 100% of the current count every 15 seconds
      - type: Percent
        value: 100
        periodSeconds: 15
      # Floor: always allow adding 5 Pods every 15 seconds
      - type: Pods
        value: 5
        periodSeconds: 15
      selectPolicy: Max
    scaleDown:
      # 10-minute stabilization window, more conservative than the default 5 minutes
      stabilizationWindowSeconds: 600
      policies:
      # Slow scale-down: remove at most 5% every 60 seconds
      - type: Percent
        value: 5
        periodSeconds: 60
      selectPolicy: Min
3.1.2 Complete VPA Configuration (Off mode + recommendation script)

# File path: /opt/k8s/vpa/production-vpa.yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: ecommerce-api-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ecommerce-api
  updatePolicy:
    updateMode: "Off"
  resourcePolicy:
    containerPolicies:
    - containerName: ecommerce-api
      minAllowed:
        cpu: 200m
        memory: 256Mi
      maxAllowed:
        cpu: 4
        memory: 8Gi
      controlledResources: ["cpu", "memory"]
      controlledValues: RequestsAndLimits
    # Sidecar containers get their own policies
    - containerName: istio-proxy
      mode: "Off"
    - containerName: filebeat
      minAllowed:
        cpu: 50m
        memory: 64Mi
      maxAllowed:
        cpu: 500m
        memory: 512Mi

3.1.3 Complete Prometheus Adapter Configuration

# File path: /opt/k8s/prometheus-adapter/values.yaml
# Helm values file
prometheus:
  url: http://prometheus-server.monitoring.svc
  port: 9090

replicas: 2

resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 512Mi

rules:
  default: false
  custom:
  # HTTP request rate
  - seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
    resources:
      overrides:
        namespace: {resource: "namespace"}
        pod: {resource: "pod"}
    name:
      matches: "^(.*)_total$"
      as: "${1}_per_second"
    metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
  # HTTP P99 latency
  - seriesQuery: 'http_request_duration_seconds_bucket{namespace!="",pod!=""}'
    resources:
      overrides:
        namespace: {resource: "namespace"}
        pod: {resource: "pod"}
    name:
      as: "http_request_duration_p99_milliseconds"
    metricsQuery: 'histogram_quantile(0.99, sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (le, <<.GroupBy>>)) * 1000'
  # gRPC request rate
  - seriesQuery: 'grpc_server_handled_total{namespace!="",pod!=""}'
    resources:
      overrides:
        namespace: {resource: "namespace"}
        pod: {resource: "pod"}
    name:
      matches: "^(.*)_total$"
      as: "${1}_per_second"
    metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
  # Message-queue backlog
  # Note: HPAs that use "type: External" read the External Metrics API; if this metric
  # is consumed that way, the rule likely belongs under rules.external instead
  - seriesQuery: 'rabbitmq_queue_messages{namespace!="",queue!=""}'
    resources:
      template: "<<.Resource>>"
    name:
      matches: "^(.*)"
      as: "$1"
    metricsQuery: '<<.Series>>{<<.LabelMatchers>>}'
  resource:
    cpu:
      containerQuery: 'sum(rate(container_cpu_usage_seconds_total{<<.LabelMatchers>>}[3m])) by (<<.GroupBy>>)'
      nodeQuery: 'sum(rate(container_cpu_usage_seconds_total{<<.LabelMatchers>>}[3m])) by (<<.GroupBy>>)'
      resources:
        overrides:
          namespace: {resource: "namespace"}
          node: {resource: "node"}
          pod: {resource: "pod"}
      containerLabel: container
    memory:
      containerQuery: 'sum(container_memory_working_set_bytes{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
      nodeQuery: 'sum(container_memory_working_set_bytes{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
      resources:
        overrides:
          namespace: {resource: "namespace"}
          node: {resource: "node"}
          pod: {resource: "pod"}
      containerLabel: container

3.2 Helper Scripts

3.2.1 VPA Recommendation Report Script

#!/bin/bash
# File name: vpa-report.sh
# Purpose: collect all VPA recommendations into a CSV report for resource planning
# Usage: ./vpa-report.sh [namespace]

NAMESPACE=${1:-"--all-namespaces"}
OUTPUT_FILE="/tmp/vpa-report-$(date +%Y%m%d-%H%M%S).csv"

if [ "$NAMESPACE" = "--all-namespaces" ]; then
  NS_FLAG="-A"
else
  NS_FLAG="-n $NAMESPACE"
fi

echo "Namespace,VPA Name,Container,Target CPU,Target Memory,Lower CPU,Lower Memory,Upper CPU,Upper Memory" > "$OUTPUT_FILE"

# $NS_FLAG is intentionally unquoted so that "-n <namespace>" splits into two arguments
kubectl get vpa $NS_FLAG -o json | jq -r '
  .items[] |
  .metadata.namespace as $ns |
  .metadata.name as $vpa |
  (.status.recommendation.containerRecommendations // [])[] |
  [$ns, $vpa, .containerName,
   (.target.cpu // "N/A"), (.target.memory // "N/A"),
   (.lowerBound.cpu // "N/A"), (.lowerBound.memory // "N/A"),
   (.upperBound.cpu // "N/A"), (.upperBound.memory // "N/A")]
  | @csv
' >> "$OUTPUT_FILE"

echo "VPA report written to: $OUTPUT_FILE"
echo "---"
column -t -s ',' "$OUTPUT_FILE"

3.2.2 HPA Health-Check Script

#!/bin/bash
# File name: hpa-check.sh
# Purpose: check the health of all HPAs and flag abnormal ones
# Usage: ./hpa-check.sh

echo "========== HPA health check =========="
echo "Checked at: $(date '+%Y-%m-%d %H:%M:%S')"
echo ""

# HPAs that cannot fetch metrics
echo "--- HPAs failing to fetch metrics ---"
kubectl get hpa -A -o json | jq -r '
  .items[] |
  select(.status.conditions[]? | select(.type=="ScalingActive" and .status=="False")) |
  [.metadata.namespace, .metadata.name,
   (.status.conditions[] | select(.type=="ScalingActive") | .message)]
  | @tsv
' | while IFS=$'\t' read -r ns name msg; do
  echo "[ERROR] $ns/$name: $msg"
done

echo ""

# HPAs that have hit maxReplicas (maxReplicas may need to be raised)
echo "--- HPAs at maximum replicas ---"
kubectl get hpa -A -o json | jq -r '
  .items[] |
  select(.status.currentReplicas >= .spec.maxReplicas) |
  [.metadata.namespace, .metadata.name,
   (.status.currentReplicas | tostring), (.spec.maxReplicas | tostring)]
  | @tsv
' | while IFS=$'\t' read -r ns name current max; do
  echo "[WARN] $ns/$name: current replicas $current have reached the limit $max; consider raising maxReplicas"
done

echo ""

# HPAs whose utilization is far below target (possible waste)
echo "--- HPAs possibly wasting resources (CPU utilization < 20% and replicas > minReplicas) ---"
kubectl get hpa -A -o json | jq -r '
  .items[] |
  select(.status.currentReplicas > .spec.minReplicas) |
  select(.status.currentMetrics[]? |
    select(.type=="Resource" and .resource.name=="cpu" and
           .resource.current.averageUtilization != null and
           .resource.current.averageUtilization < 20)) |
  [.metadata.namespace, .metadata.name,
   (.status.currentMetrics[] | select(.type=="Resource" and .resource.name=="cpu") |
    .resource.current.averageUtilization | tostring),
   (.status.currentReplicas | tostring)]
  | @tsv
' | while IFS=$'\t' read -r ns name cpu replicas; do
  echo "[INFO] $ns/$name: CPU utilization is only ${cpu}% with $replicas replicas; resources may be wasted"
done

echo ""
echo "========== Check complete =========="

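3.3 Real-World Cases

Case 1: E-commerce sales event, pre-scaling based on expected QPS

Scenario: an e-commerce platform runs its Double 11 (Singles' Day) sale and expects 10 times the usual traffic. Normally the API service runs 5 Pods at 1200 QPS each, 6000 QPS in total. The expected peak for the sale is 60000 QPS.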
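
The pre-scaling target follows directly from these numbers. The short Python sketch below shows the arithmetic; the function name is hypothetical, and the calculation assumes each Pod really sustains 1200 QPS under peak conditions, which is worth confirming with a load test.

```python
import math


def pods_for_peak(peak_qps: float, qps_per_pod: float) -> int:
    """Number of Pods needed so that each Pod stays at or below its QPS capacity."""
    return math.ceil(peak_qps / qps_per_pod)


baseline_qps = 5 * 1200        # 5 Pods at 1200 QPS each
peak_qps = baseline_qps * 10   # expected 10x traffic
print(pods_for_peak(peak_qps, 1200))
```

The result is the replica count used as the temporary minReplicas during the event.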

Implementation:

One hour before the event, raise minReplicas manually to the estimated value so that HPA scale-up speed is not the bottleneck when traffic spikes

# Before the event: pre-scale to the estimate
kubectl patch hpa ecommerce-api-hpa -n production \
  --type merge -p '{"spec":{"minReplicas":50}}'

# Confirm that the Deployment has scaled to 50 Pods
kubectl get deployment ecommerce-api -n production -w

During the event HPA keeps working; if real traffic exceeds the estimate, it continues to scale up toward maxReplicas

# Monitor the HPA in real time
watch -n 5 'kubectl get hpa ecommerce-api-hpa -n production'

After the event, lower minReplicas in steps instead of dropping it back in one go

# 2 hours after the event ends, go down to 20 first
kubectl patch hpa ecommerce-api-hpa -n production \
  --type merge -p '{"spec":{"minReplicas":20}}'

# After watching for an hour and confirming traffic is stable, go back to 5
kubectl patch hpa ecommerce-api-hpa -n production \
  --type merge -p '{"spec":{"minReplicas":5}}'

Lesson learned: relying on HPA alone on the day of the event is not enough. From the moment a metric exceeds its target until a new Pod is Ready, the chain is: metric scrape interval (15s) + HPA evaluation interval (15s) + Pod scheduling (a few seconds) + image pull (depends on image size, possibly 30s-2min) + application startup (30s-1min for a Java application). That can add up to 2-3 minutes, too slow for the traffic surge at the instant the event starts. Pre-scaling by hand is therefore a must.
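
The delay chain can be added up explicitly. The Python sketch below uses the stage durations quoted above; the 3-5 second range for scheduling is an assumed reading of "a few seconds", and real clusters will vary.

```python
# (min_seconds, max_seconds) per stage of the scale-up chain
stages = {
    "metric scrape interval": (15, 15),
    "HPA evaluation interval": (15, 15),
    "pod scheduling": (3, 5),        # assumption: "a few seconds"
    "image pull": (30, 120),
    "application startup": (30, 60),
}

best_case = sum(low for low, _ in stages.values())
worst_case = sum(high for _, high in stages.values())
print(f"scale-up delay: {best_case}s to {worst_case}s "
      f"({best_case / 60:.1f} to {worst_case / 60:.1f} minutes)")
```

Image pull and application startup dominate the total, which is why pre-pulled images and fast-starting applications shorten the reaction time the most.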

Case 2: Message-queue consumers, scaling on queue backlog depth

Scenario: an order-processing service consumes messages from RabbitMQ. Under normal conditions the queue depth stays below 100. When upstream produces a burst of orders, the backlog grows to tens of thousands of messages and more consumers are needed to work through it.

Implementation:

# order-consumer-hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: order-consumer-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-consumer
  minReplicas: 2
  maxReplicas: 30
  metrics:
  # Based on queue backlog: each Pod is expected to handle 50 messages
  - type: External
    external:
      metric:
        name: rabbitmq_queue_messages
        selector:
          matchLabels:
            queue: "order-processing"
      target:
        type: AverageValue
        averageValue: "50"
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Pods
        value: 5
        periodSeconds: 30
      selectPolicy: Max
    scaleDown:
      # Wait 10 minutes after the queue drains, in case upstream sends another burst
      stabilizationWindowSeconds: 600
      policies:
      - type: Pods
        value: 2
        periodSeconds: 60
      selectPolicy: Min

Resulting behavior:

# Normal state: queue depth 80, 2 consumers, 40 per consumer < 50, no scale-up
# Backlog state: queue depth 5000, 2 consumers, 2500 per consumer > 50
# HPA calculation: ceil(2 * 2500/50) = 100, but maxReplicas=30, so it scales toward 30
# (at most 5 Pods are added every 30 seconds, per the scaleUp policy)
# With 30 consumers working, the queue depth drops quickly
# Once the queue is empty, the 10-minute stabilization window passes, then scale-down starts at 2 Pods per minute

Note: before scaling consumers out, make sure the downstream database can take it. With 30 consumers writing at once, connection counts and write load rise sharply. Rate limiting inside the consumer or batched writes are recommended.
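
The numbers in this case can be checked with a small Python sketch. It models only the AverageValue target and the min/max clamp; the function names are hypothetical, and the step-wise scale-up policy and stabilization window are ignored apart from the simple scale-down estimate.

```python
import math


def consumers_for_backlog(queue_depth: int, target_per_pod: int,
                          min_replicas: int, max_replicas: int) -> int:
    """Replica count for an AverageValue target, clamped to the HPA bounds."""
    wanted = math.ceil(queue_depth / target_per_pod)
    return max(min_replicas, min(max_replicas, wanted))


def minutes_to_scale_down(current: int, floor: int, pods_per_minute: int) -> int:
    """Minutes needed to shrink to the floor once scale-down begins."""
    return math.ceil((current - floor) / pods_per_minute)


print(consumers_for_backlog(80, 50, 2, 30))     # normal state
print(consumers_for_backlog(5000, 50, 2, 30))   # backlog state, capped at maxReplicas
print(minutes_to_scale_down(30, 2, 2))          # after the stabilization window
```

Under these assumptions, returning from 30 consumers to 2 takes the 10-minute stabilization window plus roughly 14 more minutes of gradual scale-down.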

Case 3: VPA resource optimization for a Java application

Scenario: a Spring Boot microservice was given requests of cpu 1 / memory 2Gi and limits of cpu 2 / memory 4Gi by its developers. In operation the average CPU usage is only 15% and memory is stable at about 800Mi. VPA is used to determine the real resource needs.

Steps:

Deploy VPA (Off mode)

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: user-service-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: user-service
  updatePolicy:
    updateMode: "Off"
  resourcePolicy:
    containerPolicies:
    - containerName: user-service
      minAllowed:
        cpu: 100m
        memory: 256Mi
      maxAllowed:
        cpu: 4
        memory: 8Gi

After one week, read the recommendation

kubectl get vpa user-service-vpa -n production -o json | jq '.status.recommendation'
# Output:
# {
#   "containerRecommendations": [{
#     "containerName": "user-service",
#     "target": { "cpu": "350m", "memory": "900Mi" },
#     "lowerBound": { "cpu": "200m", "memory": "700Mi" },
#     "upperBound": { "cpu": "800m", "memory": "1200Mi" }
#   }]
# }

Adjust the Deployment resources according to the recommendation

# requests use the target value; limits use 1.5x upperBound for headroom
kubectl patch deployment user-service -n production --type=json -p='[
  {"op":"replace","path":"/spec/template/spec/containers/0/resources","value":{
    "requests":{"cpu":"350m","memory":"900Mi"},
    "limits":{"cpu":"1200m","memory":"1800Mi"}
  }}
]'

Results:

CPU requests drop from 1000m to 350m, a 65% saving

Memory requests drop from 2Gi to 900Mi, a 56% saving

For a service with 20 replicas, this saves roughly ¥15,000 per year in cloud resource costs (priced on Alibaba Cloud ECS)
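
The percentages and the patched limits can be reproduced with a short Python sketch. The helper name is hypothetical; the inputs are the request values and VPA bounds quoted in this case.

```python
def saving_percent(before: float, after: float) -> float:
    """Relative reduction of a resource request, in percent."""
    return (before - after) / before * 100


cpu_saving = saving_percent(1000, 350)        # millicores
mem_saving = saving_percent(2 * 1024, 900)    # MiB; 2Gi = 2048Mi
print(f"CPU: {cpu_saving:.0f}%, memory: {mem_saving:.0f}%")

# limits = 1.5 x upperBound, as used in the patch above
cpu_limit_m = int(800 * 1.5)
mem_limit_mi = int(1200 * 1.5)
print(f"limits: cpu {cpu_limit_m}m, memory {mem_limit_mi}Mi")
```

The same function can be pointed at any before/after pair from a VPA report to estimate the saving before changing a Deployment.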

IV. Best Practices and Caveats

4.1 Best Practices

4.1.1 Performance Tuning

Choose a sensible metric collection interval: by default HPA evaluates every 15 seconds and Metrics Server scrapes every 15 seconds. If traffic changes extremely fast (flash-sale scenarios), Metrics Server's --metric-resolution can be lowered to 10s, but not below 10s, otherwise the load on the kubelet becomes too high.

# Change the Metrics Server scrape interval
kubectl edit deployment metrics-server -n kube-system
# Change --metric-resolution=15s to --metric-resolution=10s

Pod startup speed: the faster the Pods added by HPA become Ready, the better. Key points:

Use small images: an Alpine base image is roughly 10 times smaller than an Ubuntu one, which can cut pull time from about 30s to about 3s

Pre-pull images: use a DaemonSet to pre-pull the business image on every node

Configure a sensible readinessProbe: do not set initialDelaySeconds too high; 10-15s is suggested for Java applications and 3-5s for Go applications

Enable Pod Topology Spread: keeps new Pods from all landing on the same node

# Image pre-pull DaemonSet
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: image-prepull
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: image-prepull
  template:
    metadata:
      labels:
        app: image-prepull
    spec:
      initContainers:
      - name: prepull
        image: your-registry/ecommerce-api:latest
        command: ["sh", "-c", "echo Image pulled"]
      containers:
      - name: pause
        image: registry.k8s.io/pause:3.9

Keep scale-down conservative: in production a scale-down stabilization window of 600 seconds (10 minutes) and a scale-down rate of no more than 10% per minute are recommended. It happens all too often that a brief traffic dip triggers scale-down, traffic returns 2 minutes later, and the Pods have to be added again; this back-and-forth hurts service quality.

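4.1.2 Security Hardening

Set a reasonable maxReplicas ceiling: maxReplicas must be derived from the cluster's real capacity. A 3-node cluster with 32 cores and 128G per node has 96 cores and 384G in total; if each Pod requests 1 core and 2G, maxReplicas should be about 80 at most (leaving some room for system components). With a value like 1000, HPA may really try to scale that far: the nodes run out of resources, new Pods stay Pending, and Pods on overloaded nodes may be evicted.

# Check allocatable resources in the cluster
kubectl describe nodes | grep -A 5 "Allocated resources"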
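
The "about 80" figure can be derived with a small Python sketch. The 15% reserve for system components is an assumption chosen for illustration, and the function name is hypothetical; substitute the allocatable values reported by your own nodes.

```python
import math


def max_replicas_cap(nodes: int, cores_per_node: float, mem_gib_per_node: float,
                     pod_cpu: float, pod_mem_gib: float, reserve: float = 0.15) -> int:
    """Upper bound for maxReplicas given cluster size and per-Pod requests."""
    usable_cpu = nodes * cores_per_node * (1 - reserve)
    usable_mem = nodes * mem_gib_per_node * (1 - reserve)
    # The scarcer resource decides how many Pods fit
    return math.floor(min(usable_cpu / pod_cpu, usable_mem / pod_mem_gib))


# 3 nodes x 32 cores / 128 GiB, each Pod requesting 1 core / 2 GiB
print(max_replicas_cap(3, 32, 128, pod_cpu=1, pod_mem_gib=2))
```

In this example CPU is the limiting resource; memory alone would allow about twice as many Pods.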

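RBAC access control: restrict who may modify HPA configuration. In production, changes to an HPA's minReplicas and maxReplicas should go through a change-approval process rather than being edited casually.

# Read-only Role for HPAs
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: hpa-viewer
  namespace: production
rules:
- apiGroups: ["autoscaling"]
  resources: ["horizontalpodautoscalers"]
  verbs: ["get", "list", "watch"]
# Only the SRE team gets write access
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: hpa-admin
  namespace: production
rules:
- apiGroups: ["autoscaling"]
  resources: ["horizontalpodautoscalers"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

PodDisruptionBudget alongside HPA: a PDB keeps enough Pods running during voluntary evictions such as node drains, Cluster Autoscaler node removal, and VPA Updater evictions. Note that a PDB does not throttle HPA scale-down itself, because the Deployment controller deletes Pods directly rather than through the Eviction API; use the scaleDown behavior policies for that.

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: ecommerce-api-pdb
  namespace: production
spec:
  minAvailable: "60%"
  selector:
    matchLabels:
      app: ecommerce-api

4.1.3 High Availability

High availability of HPA itself: the HPA controller runs inside kube-controller-manager and inherits the control plane's high availability. In a multi-master cluster only the leader's controller-manager is active while the others stand by. No extra configuration is needed.

Metrics Server high availability: in production, run 2 replicas of Metrics Server together with a PDB.

kubectl scale deployment metrics-server -n kube-system --replicas=2

Backup strategy: HPA and VPA configurations are Kubernetes resource objects and are covered by etcd backups. It is also advisable to keep the YAML in a Git repository and manage it GitOps-style.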

4.2 Caveats

4.2.1 Configuration Caveats

Warning: the following configuration mistakes can cause serious production incidents; double-check before changing anything.

Deployments must set resources.requests: without requests HPA cannot compute CPU/memory utilization percentages and reports FailedGetResourceMetric. This is the most common reason an HPA does not work.

Do not run HPA and VPA Auto mode on CPU/memory metrics at the same time: HPA says "CPU is high, add Pods", VPA says "CPU is high, add resources", the two controllers fight, and replica counts and resource settings oscillate.

Do not set minReplicas to 1: after scaling down to one Pod, a crash of that Pod or a failure of its node takes the service down. Use at least 2 in production and 3 for core services.

4.2.2 Common Errors

Symptom	Cause	Fix
HPA shows <unknown>/50%	Metrics Server is not installed, or the Pods have no requests	Install Metrics Server; add resources.requests to the Pods
HPA TARGETS look normal but nothing scales	The metric has not exceeded the target, or it is within the tolerance band (10% by default)	Confirm that the metric really exceeds 110% of the target
Pods stay Pending after HPA scales up	The cluster lacks resources and the new Pods cannot be scheduled	Add nodes or configure Cluster Autoscaler
VPA recommendation stays empty	The VPA Recommender is not running, or not enough data has been collected	Check the vpa-recommender Pod logs; wait at least 8 hours
Pods restart frequently in VPA Auto mode	Every resource adjustment by VPA recreates the Pod	Set a PDB to limit how many Pods are recreated at once, or switch to Off mode
Custom-metric HPA reports no matches for kind "ExternalMetric"	Prometheus Adapter is misconfigured or its API is not registered	Check the status of the custom.metrics APIs in kubectl get apiservices
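
The tolerance row in the table is a frequent source of confusion, so here is a minimal Python sketch of the check. It assumes the default 10% tolerance mentioned above; the function name and sample readings are illustrative.

```python
def within_tolerance(current: float, target: float, tolerance: float = 0.1) -> bool:
    """True when the metric/target ratio is close enough to 1 that HPA does nothing."""
    return abs(current / target - 1.0) <= tolerance


# Target is 50% CPU utilization
for reading in (54, 56, 70):
    action = "no scaling" if within_tolerance(reading, 50) else "scale"
    print(f"{reading}%/50% -> {action}")
```

A reading of 54% against a 50% target is above the target but still inside the band, so the replica count does not change until the ratio passes 1.1.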

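4.2.3 Compatibility

Version compatibility: the HPA autoscaling/v2 API went GA in Kubernetes 1.23. On 1.18-1.22 use autoscaling/v2beta2 (the behavior field was added in 1.18); on older clusters use autoscaling/v2beta1. After upgrading a cluster, remember to update the apiVersion of existing HPA manifests as well, since the beta versions were removed in later releases (v2beta1 in 1.25, v2beta2 in 1.26).

Platform compatibility: most managed Kubernetes offerings (EKS, AKS, GKE, ACK) ship Metrics Server or provide it as an add-on, so a manual install is often unnecessary; confirm with kubectl top nodes, because some platforms have required installing it yourself. Scaling on custom metrics still requires deploying Prometheus Adapter on your own.

Component dependencies: VPA and Cluster Autoscaler (CA) work together. If VPA raises a Pod's resources and no node can fit it, CA adds nodes automatically. Keep CA's node provisioning time in mind (usually 2-5 minutes); the Pod stays Pending during that time.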

V. Troubleshooting and Monitoring

5.1 Troubleshooting

5.1.1 Viewing Logs

# HPA controller logs (inside kube-controller-manager)
kubectl logs -n kube-system $(kubectl get pods -n kube-system -l component=kube-controller-manager -o name | head -1) | grep -i "horizontal"

# Metrics Server logs
kubectl logs -n kube-system -l k8s-app=metrics-server -f

# VPA Recommender logs
kubectl logs -n kube-system -l app=vpa-recommender -f

# VPA Updater logs
kubectl logs -n kube-system -l app=vpa-updater -f

# Prometheus Adapter logs
# (the label depends on the chart version; some charts use app.kubernetes.io/name=prometheus-adapter)
kubectl logs -n monitoring -l app=prometheus-adapter -f

5.1.2 Common Problems

Problem 1: HPA TARGETS shows <unknown>

# Step 1: confirm that Metrics Server is healthy
kubectl get apiservices | grep metrics
# Expected: v1beta1.metrics.k8s.io   kube-system/metrics-server   True

# Step 2: confirm that Pod metrics can be fetched
kubectl top pods -n default
# If this fails, check the Metrics Server logs
kubectl logs -n kube-system -l k8s-app=metrics-server --tail=50

# Step 3: confirm that the Pods set resources.requests
kubectl get deployment php-apache -o jsonpath='{.spec.template.spec.containers[0].resources}'

Fixes:

Metrics Server not installed → install it as described in section 2.1.2

Metrics Server cannot reach the kubelet → add the --kubelet-insecure-tls flag

Pods have no requests → add resources.requests to the Deployment

Problem 2: HPA scales up too slowly; traffic has arrived but there are still too few Pods

# Check the current HPA status and events
kubectl describe hpa ecommerce-api-hpa -n production

# Check the scale-up policy
kubectl get hpa ecommerce-api-hpa -n production -o jsonpath='{.spec.behavior.scaleUp}' | jq .

Fixes:

Adjust the scale-up policy to allow larger steps:

behavior:
  scaleUp:
    stabilizationWindowSeconds: 0
    policies:
    - type: Percent
      value: 200       # grow to at most 3x the current count per period
      periodSeconds: 15
    - type: Pods
      value: 10        # always allow adding 10 Pods per period
      periodSeconds: 15
    selectPolicy: Max

For predictable traffic peaks, raise minReplicas manually in advance

Problem 3: VPA Auto mode causes service jitter

Symptom: VPA recreates Pods frequently and the service becomes intermittently unavailable

Diagnosis:

# Show the VPA eviction history
kubectl get events -n production --field-selector reason=EvictedByVPA --sort-by='.lastTimestamp'

# Show how the VPA recommendations have changed
kubectl get vpa -n production -o json | jq '.items[].status.recommendation'

Fixes:

Switch updateMode to Off and apply recommendations by hand

If Auto is required, configure a PDB to limit how many Pods are evicted at once

Adjust the VPA minAllowed and maxAllowed bounds to limit how far recommendations can swing

5.1.3 Debug Mode

# Raise the HPA log level of kube-controller-manager
# Edit the kube-controller-manager startup flags and add:
# --v=4 (prints the detailed HPA calculation)

# View the detailed HPA calculation
kubectl logs -n kube-system kube-controller-manager-master01 | grep -A 10 "autoscaler"

# Query the Metrics API directly to see the raw metric data
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/default/pods" | jq .

# View raw custom-metric data
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/production/pods/*/http_requests_per_second" | jq .

5.2 Performance Monitoring

5.2.1 Key Metrics

# Overview of all HPAs
kubectl get hpa -A -o wide

# Detailed metrics of one HPA
kubectl get hpa ecommerce-api-hpa -n production -o yaml | grep -A 20 "currentMetrics"

# Pod resource usage
kubectl top pods -n production --sort-by=cpu
kubectl top pods -n production --sort-by=memory

# Node resource usage (to judge whether more nodes are needed)
kubectl top nodes

5.2.2 Metric Reference

Metric	Normal range	Alert threshold	Notes
HPA current replicas / max replicas	< 80%	>= 90%	Close to the ceiling; maxReplicas may need to be raised
HPA metric value relative to target	90%-110%	> 150% for 5 minutes	A metric that stays above target means scale-up cannot keep pace
Pod CPU utilization	40%-70%	> 85% for 3 minutes	Sustained high utilization suggests the HPA target is set too high
Pod memory utilization	50%-80%	> 90%	Pods approaching their limits get OOMKilled
Metrics Server latency	< 1s	> 5s	High latency delays HPA decisions
VPA recommendation vs. actual setting	< 20%	> 50%	A large gap means the resource settings are off

5.2.3 Alerting Configuration

# Prometheus alert rules: hpa-alerts.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: hpa-alerts
  namespace: monitoring
spec:
  groups:
  - name: hpa.rules
    rules:
    # HPA has reached max replicas
    - alert: HPAMaxedOut
      expr: |
        kube_horizontalpodautoscaler_status_current_replicas
        ==
        kube_horizontalpodautoscaler_spec_max_replicas
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: "HPA {{ $labels.namespace }}/{{ $labels.horizontalpodautoscaler }} has reached max replicas"
        description: "Current replicas have been at the maxReplicas ceiling for 10 minutes. Consider raising maxReplicas or optimizing service performance."

    # HPA cannot fetch metrics
    - alert: HPAScalingInactive
      expr: |
        kube_horizontalpodautoscaler_status_condition{condition="ScalingActive",status="false"} == 1
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "HPA {{ $labels.namespace }}/{{ $labels.horizontalpodautoscaler }} cannot fetch metrics"
        description: "The HPA ScalingActive condition is False and autoscaling is not working. Check Metrics Server or Prometheus Adapter."

    # HPA replica count is 0 (service completely unavailable)
    - alert: HPAReplicasZero
      expr: |
        kube_horizontalpodautoscaler_status_current_replicas == 0
      for: 1m
      labels:
        severity: critical
      annotations:
        summary: "HPA {{ $labels.namespace }}/{{ $labels.horizontalpodautoscaler }} has 0 replicas"
        description: "The service has 0 replicas and is completely unavailable. Check immediately."

    # CPU utilization stays too high
    # (summed over a Pod's containers so that usage and requests cover the same set)
    - alert: HighCPUUtilization
      expr: |
        sum(rate(container_cpu_usage_seconds_total{container!="",pod!=""}[5m])) by (namespace, pod)
        /
        sum(kube_pod_container_resource_requests{resource="cpu"}) by (namespace, pod)
        > 0.9
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Pod {{ $labels.namespace }}/{{ $labels.pod }} CPU utilization is above 90%"

    # Metrics Server unavailable
    - alert: MetricsServerDown
      expr: |
        up{job="metrics-server"} == 0
      for: 2m
      labels:
        severity: critical
      annotations:
        summary: "Metrics Server is unavailable; no HPA will be able to work"

5.3 Backup and Restore

5.3.1 Backup Strategy

#!/bin/bash
# File: backup-hpa-vpa.sh
# Purpose: back up all HPA and VPA configurations to files

BACKUP_DIR="/opt/k8s/backup/autoscaling/$(date +%Y%m%d)"
mkdir -p "$BACKUP_DIR"

# Back up all HPA configurations, one file per namespace
echo "Backing up HPA configurations..."
for ns in $(kubectl get hpa -A -o jsonpath='{range .items[*]}{.metadata.namespace}{"\n"}{end}' | sort -u); do
  mkdir -p "$BACKUP_DIR/hpa/$ns"
  kubectl get hpa -n "$ns" -o yaml > "$BACKUP_DIR/hpa/$ns/all-hpa.yaml"
done

# Back up all VPA configurations, one file per namespace
echo "Backing up VPA configurations..."
for ns in $(kubectl get vpa -A -o jsonpath='{range .items[*]}{.metadata.namespace}{"\n"}{end}' | sort -u); do
  mkdir -p "$BACKUP_DIR/vpa/$ns"
  kubectl get vpa -n "$ns" -o yaml > "$BACKUP_DIR/vpa/$ns/all-vpa.yaml"
done

# Back up the Prometheus Adapter configuration
echo "Backing up Prometheus Adapter configuration..."
kubectl get configmap prometheus-adapter -n monitoring -o yaml > "$BACKUP_DIR/prometheus-adapter-config.yaml" 2>/dev/null

echo "Backup complete: $BACKUP_DIR"
ls -lR "$BACKUP_DIR"

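5.3.2 Restore Procedure

Confirm the current state: kubectl get hpa,vpa -A

Restore the HPA configuration: kubectl apply -R -f /opt/k8s/backup/autoscaling/20260208/hpa/ (-R recurses into the per-namespace subdirectories created by the backup script)

Restore the VPA configuration: kubectl apply -R -f /opt/k8s/backup/autoscaling/20260208/vpa/

Verify the result: run kubectl get hpa,vpa -A and confirm that all resources have been restored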
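
The restore steps can also be wrapped in a small script. The following is a sketch under stated assumptions: it expects the directory layout written by backup-hpa-vpa.sh, and the function name `restore_autoscaling` and the `KUBECTL` override variable are hypothetical names introduced here. It validates each directory with a server-side dry run and applies it only if validation succeeds.

```shell
#!/usr/bin/env bash
# Restore HPA and VPA manifests from a dated backup directory.
# KUBECTL can be overridden (e.g. KUBECTL=echo) to preview the commands.
KUBECTL=${KUBECTL:-kubectl}

restore_autoscaling() {
  local backup_dir=$1 kind
  for kind in hpa vpa; do
    # Skip kinds that were not backed up (e.g. VPA is not installed).
    [ -d "$backup_dir/$kind" ] || continue
    # Validate against the API server without persisting anything.
    "$KUBECTL" apply -R -f "$backup_dir/$kind" --dry-run=server || return 1
    # Apply for real once validation has passed.
    "$KUBECTL" apply -R -f "$backup_dir/$kind" || return 1
  done
}

# Example: restore_autoscaling /opt/k8s/backup/autoscaling/20260208
```

Manifests exported with kubectl get -o yaml carry server-populated fields such as resourceVersion. If the dry run reports conflicts against existing objects, those fields may need to be stripped before applying.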

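6. Summary

6.1 Key Takeaways

HPA core formula: desiredReplicas = ceil(currentReplicas * (currentMetricValue / desiredMetricValue)). Most HPA scaling behavior follows directly from this formula.

VPA in production: use Off mode to obtain recommendations, review them manually, then adjust the resource configuration by hand. This is the safest path.

behavior policies: use the Max policy on scale-up to respond quickly and the Min policy on scale-down to shrink gradually. stabilizationWindowSeconds is the key parameter for damping flapping.

Combining HPA and VPA: run VPA in Off mode as a resource advisor and let HPA do the actual scaling. Do not run both in Auto mode on CPU/memory metrics at the same time.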
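
To make the core formula concrete, here is a worked sketch in integer-only bash. It assumes the metric values are whole numbers in the same unit (for example, CPU utilization percent); the function name `desired_replicas` is introduced here for illustration. It deliberately leaves out the controller's tolerance band and the minReplicas/maxReplicas clamp.

```shell
#!/usr/bin/env bash
# desiredReplicas = ceil(currentReplicas * (currentMetricValue / desiredMetricValue))
# Integer-only form: ceil(a*b/c) == (a*b + c - 1) / c for positive integers.
# Assumptions: whole-number metric values in the same unit; the tolerance
# band and the min/max replica clamp are omitted.
desired_replicas() {
  local current_replicas=$1 current_metric=$2 desired_metric=$3
  echo $(( (current_replicas * current_metric + desired_metric - 1) / desired_metric ))
}

desired_replicas 4 90 60    # 4 Pods at 90% CPU, target 60% -> prints 6
desired_replicas 10 30 60   # 10 Pods at 30% CPU, target 60% -> prints 5
```

In a real cluster the controller skips scaling when the metric ratio is within a tolerance of 1.0 (10% by default) and then clamps the result to the configured minReplicas and maxReplicas, so observed replica counts can differ from the raw formula.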

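6.2 Further Learning

KEDA (Kubernetes Event-Driven Autoscaling): an event-driven scaling solution that is more flexible than the native HPA. It supports scaling to and from zero and ships 50+ built-in triggers (Kafka, RabbitMQ, Redis, Cron, and others).

Project site: https://keda.sh

Practical advice: if your HPA scales mainly on message queues or external metrics, KEDA is much simpler than the Prometheus Adapter approach.

Cluster Autoscaler working with HPA: HPA scales Pods and CA scales nodes; together they form a complete elastic scaling chain.

Documentation: https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler

Practical advice: configure CA's scale-up cooldown and scale-down delay to avoid adding and removing nodes too frequently.

Multidimensional Pod Autoscaler (MPA): an approach used internally at Google that performs horizontal and vertical scaling at the same time. The community has similar open-source implementations.

Practical advice: follow the progress of the Kubernetes SIG-Autoscaling community.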


Appendix

A. Command Quick Reference

# HPA operations
kubectl get hpa -A                                  # List all HPAs
kubectl get hpa <hpa-name> -o wide                  # Show HPA details
kubectl describe hpa <hpa-name>                     # Show HPA events and status
kubectl autoscale deployment <deployment-name> --min=2 --max=10 --cpu-percent=50  # Quickly create an HPA
kubectl patch hpa <hpa-name> --type merge -p '{"spec":{"minReplicas":10}}'  # Change minReplicas
kubectl delete hpa <hpa-name>                       # Delete an HPA

# VPA operations
kubectl get vpa -A                                  # List all VPAs
kubectl get vpa <vpa-name> -o jsonpath='{.status.recommendation}' | jq .  # Show recommendations
kubectl delete vpa <vpa-name>                       # Delete a VPA

# Viewing metrics
kubectl top nodes                                   # Node resource usage
kubectl top pods -n <namespace> --sort-by=cpu       # Pods sorted by CPU
kubectl top pods -n <namespace> --sort-by=memory    # Pods sorted by memory
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/pods" | jq .    # Raw metric data
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .  # List of custom metrics

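B. Configuration Parameter Reference

HPA behavior parameters:

Parameter	Default	Description
scaleUp.stabilizationWindowSeconds	0	Scale-up stabilization window; 0 means scale up immediately
scaleDown.stabilizationWindowSeconds	300	Scale-down stabilization window; 5 minutes by default
policies[].type	-	Pods (fixed count) or Percent (percentage)
policies[].value	-	Number or percentage of replicas changed per period
policies[].periodSeconds	-	Length of the period the policy applies to; minimum 1 second, maximum 1800 seconds
selectPolicy	Max	Max picks the policy allowing the largest change, Min the smallest, Disabled turns off scaling in that direction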
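
The interaction between policies[] and selectPolicy is easier to see with numbers. The sketch below is an illustration, not the controller's code: it computes the replica ceiling for a single scale-up step given one Pods policy and one Percent policy. It assumes no scaling has already happened within the policy period, and the function name `scale_up_limit` is hypothetical.

```shell
#!/usr/bin/env bash
# Upper bound on replicas after one scale-up step, given one Pods policy
# and one Percent policy. Assumes no scaling happened earlier in the
# policy period, so the period starts from current_replicas.
scale_up_limit() {
  local current_replicas=$1 pods_value=$2 percent_value=$3 select_policy=$4
  local by_pods=$(( current_replicas + pods_value ))
  # ceil(current * (1 + percent/100)) using integer arithmetic
  local by_percent=$(( (current_replicas * (100 + percent_value) + 99) / 100 ))
  case $select_policy in
    Max)      echo $(( by_pods > by_percent ? by_pods : by_percent )) ;;
    Min)      echo $(( by_pods < by_percent ? by_pods : by_percent )) ;;
    Disabled) echo "$current_replicas" ;;
    *)        echo "unknown selectPolicy: $select_policy" >&2; return 1 ;;
  esac
}

scale_up_limit 10 4 100 Max   # Pods allows 14, Percent allows 20 -> prints 20
scale_up_limit 10 4 100 Min   # same policies, smallest change    -> prints 14
```

The result is only a ceiling: the HPA still scales to the smaller of this limit and the desiredReplicas value produced by the core formula.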

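VPA resourcePolicy parameters:

Parameter	Description
containerName	Container name; "*" matches all containers
mode	Auto (adjust automatically) or Off (do not adjust this container)
minAllowed	Lower bound on resources; VPA recommendations will not go below this value
maxAllowed	Upper bound on resources; VPA recommendations will not go above this value
controlledResources	Which resources VPA manages; options are cpu and memory
controlledValues	RequestsOnly (adjust requests only) or RequestsAndLimits (adjust both)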
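
The effect of minAllowed and maxAllowed amounts to clamping the raw recommendation into a range. A minimal sketch follows, assuming CPU values expressed in millicores; the function name `clamp_recommendation` and the sample numbers are made up for illustration.

```shell
#!/usr/bin/env bash
# Clamp a raw recommendation into [min_allowed, max_allowed].
# Values are CPU millicores here; the same logic applies to memory in bytes.
clamp_recommendation() {
  local raw=$1 min_allowed=$2 max_allowed=$3
  if (( raw < min_allowed )); then
    echo "$min_allowed"
  elif (( raw > max_allowed )); then
    echo "$max_allowed"
  else
    echo "$raw"
  fi
}

clamp_recommendation 50 100 2000     # below minAllowed -> prints 100
clamp_recommendation 3500 100 2000   # above maxAllowed -> prints 2000
clamp_recommendation 640 100 2000    # within bounds    -> prints 640
```

Setting maxAllowed below what a node can actually schedule is a common safeguard, since an unbounded recommendation can otherwise leave a Pod unschedulable.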

C. Glossary

Term	Explanation
Horizontal Scaling	Adjusting service capacity by adding or removing Pod replicas
Vertical Scaling	Adjusting service capacity by changing the CPU/memory allocation of individual Pods
Stabilization Window	The time window over which the HPA considers past recommendations before acting, used to keep metric jitter from causing frequent scaling
Metrics Server	Cluster-level resource metrics aggregator for Kubernetes that provides CPU and memory usage data
Recommender	VPA component that analyzes historical resource usage and generates resource recommendations
Updater	VPA component that evicts Pods whose resource configuration needs updating
Admission Controller	VPA component that injects the recommended resource configuration when a Pod is created
Custom Metrics	User-defined business metrics such as QPS or queue depth, exposed to Kubernetes through Prometheus Adapter

Original title: From Traffic Spikes to Resource Optimization: A Practical Guide to Kubernetes HPA + VPA

Source: WeChat ID magedu-Linux, WeChat official account 馬哥Linux運維.
