kube-state-metrics

Errors

kube-state-metrics
bash
E0624 16:46:27.590502       1 metrics_handler.go:227] "Failed to write metrics"
E0624 16:47:27.573964       1 metrics_handler.go:227] "Failed to write metrics" err="failed to write help text: write tcp 100.81.9.63:8080->100.81.9.62:44078: write: connection reset by peer"
E0624 16:47:37.556174       1 metrics_handler.go:227] "Failed to write metrics" err="failed to write metrics family: write tcp 100.81.9.63:8080->100.81.9.62:59302: write: broken pipe"
vm-agent
bash
{"ts":"2025-06-25T02:52:13.002Z","level":"warn","caller":"VictoriaMetrics/lib/promscrape/scrapework.go:383","msg":"cannot scrape target \"http://100.80.236.166:8080/metrics\" ({container=\"kube-state-metrics\",endpoint=\"http\",instance=\"100.80.236.166:8080\",job=\"vm-k8s-kube-state-metrics\",namespace=\"monitor\",pod=\"vm-k8s-kube-state-metrics-6c994db859-2g9gt\",service=\"vm-k8s-kube-state-metrics\"}) 1 out of 1 times during -promscrape.suppressScrapeErrorsDelay=0s; the last error: the response from \"http://100.80.236.166:8080/metrics\" exceeds -promscrape.maxScrapeSize or max_scrape_size in the scrape config (16777216 bytes). Possible solutions are: reduce the response size for the target, increase -promscrape.maxScrapeSize command-line flag, increase max_scrape_size value in scrape config for the given target"}
  • 100.81.9.63:8080: kube-state-metrics endpoint
  • 100.81.9.62:59302: vm-agent endpoint
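  • The cluster has too many namespaces, so the /metrics response contains too many metrics to scrape: it exceeds the 16777216-byte limit reported by vm-agent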

Test Metrics

Port Forward
bash
kubectl port-forward pod/xxx 8080:8080
Get Metrics
bash
curl localhost:8080/metrics | head -n 50
Check Size
bash
curl localhost:8080/metrics > metrics.txt
Errors
bash
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:00:30 --:--:--     0
curl: (56) Recv failure: Connection reset by peer
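  • Because there are many namespaces and a large number of resources, the /metrics response is very large.
  • The connection is reset partway through the transfer (connection reset by peer).
  • As a result, curl fails and the complete data cannot be retrieved.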
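
When the download does complete (for example, on a smaller cluster or after limiting namespaces), the saved file can be compared with the 16777216-byte limit from the vm-agent log. The following is a minimal sketch; `check_scrape_size` and `MAX_SCRAPE_SIZE` are hypothetical names chosen for illustration and are not part of any tool.

```shell
#!/usr/bin/env bash
# Default limit taken from the vm-agent log message (16777216 bytes).
MAX_SCRAPE_SIZE=16777216

# check_scrape_size FILE [LIMIT]
# Hypothetical helper: prints the size of FILE and whether it fits within
# LIMIT bytes. Returns 0 if the file fits, 1 if it exceeds the limit.
check_scrape_size() {
  local file="$1"
  local limit="${2:-$MAX_SCRAPE_SIZE}"
  local size
  # wc -c counts bytes; tr strips the padding some wc builds add.
  size=$(wc -c < "$file" | tr -d ' ')
  if [ "$size" -gt "$limit" ]; then
    echo "size=${size} bytes exceeds limit=${limit} bytes"
    return 1
  fi
  echo "size=${size} bytes within limit=${limit} bytes"
}
```

Typical use would be `check_scrape_size metrics.txt` after the `curl` command in Check Size has written the file.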

Solutions

Specify Namespace

yaml
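spec:
    automountServiceAccountToken: true
    containers:
    - args:
      # Flags must be nested under args, not listed as sibling containers.
      - --port=8080
      # Only collect metrics for the listed namespaces.
      - --namespaces=default,kube-system,monitor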
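
To see which namespaces contribute the most series, or to confirm that only the namespaces passed to `--namespaces` remain after the change, a saved copy of the /metrics output can be summarized per namespace. This is a sketch under assumptions: `count_series_by_namespace` is a hypothetical helper name, and it assumes the output is in Prometheus text format with a `namespace` label, as kube-state-metrics emits for namespaced resources.

```shell
#!/usr/bin/env bash
# count_series_by_namespace FILE
# Hypothetical helper: prints "<count> <namespace>" for every value of the
# namespace label found in FILE, largest count first.
count_series_by_namespace() {
  # Drop "# HELP" / "# TYPE" comment lines, then pull out the namespace label.
  # The leading [{,] avoids matching labels that merely end in "namespace".
  grep -v '^#' "$1" \
    | grep -oE '[{,]namespace="[^"]*"' \
    | cut -d'"' -f2 \
    | sort | uniq -c | sort -rn \
    | awk '{print $1, $2}'
}
```

For example, `count_series_by_namespace metrics.txt | head` lists the namespaces with the most series in the file saved earlier.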