Communication Errors


Background

  • ✅ The MySQL client can connect from inside the MySQL pod itself.
  • ❌ Clients can't connect to MySQL through the K8s Service.
    • The Service endpoint is correct.
    • getent inside a pod fails intermittently.
    • The cause is a pod network issue (Calico CNI).

Errors

JDBC Errors
com.mysql.cj.jdbc.exceptions.CommunicationsException: Communications link failure
...
The driver has not received any packets from the server
MySQL Errors
2025-07-29T02:45:56.595368Z 21528 [Note] Aborted connection 21528 to db: 'foo' user: 'root' host: '10.x.x.x' (Got an error reading communication packets)

Solutions

Check MySQL from Inside the Pod

kubectl exec -it mysql-0 -- /bin/bash
mysql -uroot -p

Check Pod Network

If the lookup below fails intermittently, one or more CNI (Calico) pods might be down or not ready.

getent ahosts mysql-haproxy.ns.svc.cluster.local
10.x.x.x   STREAM mysql-haproxy.ns.svc.cluster.local
10.x.x.x   DGRAM
10.x.x.x   RAW
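Because the failure is intermittent, a single getent call can succeed and hide the problem. The following is a minimal sketch that repeats the lookup and counts failures. The function name count_dns_failures, the RESOLVER override, and the default of 20 attempts are illustrative assumptions, not part of the original procedure.

```shell
# Repeat a lookup N times and print how many attempts failed.
# RESOLVER is a hypothetical override (default: `getent ahosts`) so the loop
# can also be tried with a stub command outside the cluster.
count_dns_failures() (
  host="$1"
  attempts="${2:-20}"
  failures=0
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Word splitting is intentional: the value holds a command plus arguments.
    ${RESOLVER:-getent ahosts} "$host" >/dev/null 2>&1 || failures=$((failures + 1))
    i=$((i + 1))
  done
  echo "$failures"
)
```

Inside the pod, something like count_dns_failures mysql-haproxy.ns.svc.cluster.local 50 could be run; any non-zero result suggests the lookup is flaky rather than consistently broken.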

Check CNI(Calico) Pods

kubectl get pods | grep calico
calico-kube-controllers-6f697bd6bc-8s6df   1/1     Running   1 (17h ago)   27d
calico-node-jq9lc                          0/1     Running   0             6s
calico-node-ms4wt                          1/1     Running   2 (16h ago)   27d
calico-node-q4z9q                          1/1     Running   2 (16h ago)   27d

calico-node-jq9lc is not ready (0/1). Delete it to fix the problem; the pod is recreated automatically.

kubectl delete pod calico-node-jq9lc
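To spot unready pods such as calico-node-jq9lc without reading the table by eye, the READY column can be filtered. This is a small sketch; the helper name not_ready_pods is made up for illustration, and it assumes the default kubectl get pods column layout shown above.

```shell
# Read `kubectl get pods` output on stdin and print the names of pods whose
# READY column (ready/total) shows fewer ready containers than total.
not_ready_pods() {
  awk '$2 ~ /^[0-9]+\/[0-9]+$/ {
    split($2, ready, "/")
    if (ready[1] + 0 < ready[2] + 0) print $1
  }'
}
```

A possible use is kubectl get pods | grep calico | not_ready_pods, which prints only the pods that need attention.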

Calico Error Logs

kubectl logs calico-node-ms4wt --previous
2025-07-28 09:20:42.273 [WARNING][96] felix/route_table.go 740: Failed to delete route error=no such process ifaceName="cali125e9a7ccf9" ifaceRegex="^cali.*" ipVersion=0x4 tableIndex=0
2025-07-28 09:20:44.002 [INFO][96] felix/route_table.go 1205: Failed to access interface because it doesn't exist. error=Link not found ifaceName="cali6ac61bc7d83" ifaceRegex="^cali.*" ipVersion=0x4 tableIndex=0
2025-07-28 09:20:44.002 [INFO][96] felix/route_table.go 1273: Failed to get interface; it's down/gone. error=Link not found ifaceName="cali6ac61bc7d83" ifaceRegex="^cali.*" ipVersion=0x4 tableIndex=0
2025-07-28 09:21:08.572 [INFO][96] felix/route_table.go 1205: Failed to access interface because it doesn't exist. error=Link not found ifaceName="califb58b76f486" ifaceRegex="^cali.*" ipVersion=0x4 tableIndex=0
2025-07-28 09:21:08.572 [INFO][96] felix/route_table.go 1273: Failed to get interface; it's down/gone. error=Link not found ifaceName="califb58b76f486" ifaceRegex="^cali.*" ipVersion=0x4 tableIndex=0
2025-07-28 09:50:13.390 [INFO][98] confd/watchercache.go 125: Watch error received from Upstream ListRoot="/calico/ipam/v2/host/ingress-2" error=too old resource version: 1057 (7933983)

Failed to Get Interface

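Failed to get interface; it's down/gone. error=Link not found ifaceName="cali6ac61bc7d83"

Calico tried to operate on an interface (for example cali6ac61bc7d83), but the kernel reported that the interface no longer exists. Typical causes:

  • The related Pod was deleted, but Calico has not finished cleaning up its routes.
  • The Pod was recreated quickly, the old interface disappeared, and Felix has not noticed yet.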
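To check whether the interfaces named in such log lines are really gone, it can help to collect the names first. The sketch below assumes the Felix log format shown in the log excerpt above; the helper name missing_ifaces is hypothetical.

```shell
# Read Felix log lines on stdin and print each interface name that was
# reported with "Link not found", once per interface.
missing_ifaces() {
  grep 'Link not found' \
    | sed -n 's/.*ifaceName="\([^"]*\)".*/\1/p' \
    | LC_ALL=C sort -u
}
```

For example, kubectl logs calico-node-ms4wt --previous | missing_ifaces lists the names; running ip link show on one of them on the node then shows whether the device still exists.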

Failed to Delete Route

Failed to delete route error=no such process ifaceName="cali125e9a7ccf9"

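Calico tried to delete a route bound to this interface, but the system returned no such process. This is a harmless warning: the system state had already changed, so Calico's cleanup was redundant.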

Too Old Resource Version

Watch error received from Upstream ListRoot="/calico/ipam/v2/host/ingress-2" error=too old resource version

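confd (Calico's internal configuration sync component) failed to watch data from etcd (or kube-apiserver) because its resource version was too old. This is normal behavior for the Kubernetes watch mechanism: if the client's version falls too far behind (for example after a local network interruption or under high load), the watch is dropped and the client re-lists everything.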

Calico Troubleshooting

Pod Network Interface

ip route | grep cali
10.244.181.190 dev calia3bb7ac8316 scope link
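
Field                 Meaning
10.244.181.190        IP address of the target Pod (within the Pod CIDR)
dev calia3bb7ac8316   Network interface the route uses (a veth interface created automatically by Calico)
scope link            The route is valid only on the local link, i.e. it is a directly connected route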

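The command lists every route in the Linux routing table that goes through a cali* interface. Each cali* interface is one end of the virtual interface pair (veth pair) that the Calico CNI plugin creates for a Pod. These routes therefore describe the Pods running on this node: each Pod IP is reached through its Calico-created virtual interface.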

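In other words, this is how Calico works: each Pod is assigned an IP, Calico creates a veth interface for the Pod, and the Linux routing table gets a route to that IP that sends packets through the corresponding caliXXXX interface.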
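The Pod IP to interface mapping can be pulled out of the routing table directly. This is a minimal sketch; the helper name cali_routes is an assumption for illustration, and it relies on the ip route line format shown above.

```shell
# Read `ip route` output on stdin and print "<pod-ip> <interface>" for every
# route that goes through a cali* device.
cali_routes() {
  awk '{
    for (i = 1; i < NF; i++)
      if ($i == "dev" && $(i + 1) ~ /^cali/) print $1, $(i + 1)
  }'
}
```

On a node, ip route | cali_routes gives one line per local Pod, which makes it easier to match an IP to its interface.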

Find the Pod associated with an IP:
kubectl get po -o wide -A | grep 10.244.181.190

Network Traffic

The counters below are XDP (eXpress Data Path) statistics; if XDP is disabled, they are all 0.

ethtool -S calia3bb7ac8316
NIC statistics:
     peer_ifindex: 4
     rx_queue_0_xdp_packets: 0
     rx_queue_0_xdp_bytes: 0
     rx_queue_0_xdp_drops: 0

Check interface details (output of ip link show cali99a6c2c8fff):

39: cali99a6c2c8fff@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1480 qdisc noqueue state UP mode DEFAULT group default 
    link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 5

Check interface traffic (output of ip -s link show cali99a6c2c8fff):

39: cali99a6c2c8fff@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1480 qdisc noqueue state UP mode DEFAULT group default 
    link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 5
    RX: bytes  packets  errors  dropped overrun mcast   
    1684652748 1060169  0       0       0       0       
    TX: bytes  packets  errors  dropped carrier collsns 
    745562168  1438535  0       0       0       0
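
  • RX (receive) direction:
    • Received 1.68 GB of data in about 1.06 million packets.
  • TX (transmit) direction:
    • Sent 745 MB of data in about 1.44 million packets.

The interface is operating normally: no errors (errors=0), no dropped packets (dropped=0), and no collisions (collsns=0).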
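The same health check can be scripted so that non-zero error or drop counters stand out. This sketch assumes the ip -s link layout shown above, where each RX:/TX: header row is followed by a row of values; the helper name iface_errors is hypothetical.

```shell
# Read `ip -s link show <iface>` output on stdin and print the error and drop
# counters for each direction. The value row follows each RX:/TX: header row.
iface_errors() {
  awk '$1 == "RX:" || $1 == "TX:" {
    dir = substr($1, 1, 2)
    if ((getline) > 0) print dir, "errors=" $3, "dropped=" $4
  }'
}
```

For instance, ip -s link show cali99a6c2c8fff | iface_errors prints one line for RX and one for TX.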

iptables Routing

Check MySQL port/nodeport rules.

PREROUTING
  ↓
KUBE-SERVICES (matches dpt:18740)
  ↓
KUBE-EXT-XXXXXXX (the chain you are currently looking at)
  ↓
KUBE-SVC-XXXXXXXX (jump for the Service)
  ↓
KUBE-SEP-XXXXXXXX (jump for the Pod endpoint)
  ↓
DNAT to Pod IP:Port

iptables -t nat -L -n -v | grep -E '3306|18740'
    0     0 KUBE-EXT-YPK62LGMC5XZ5MSF  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* mysql-haproxy:mysql */ tcp dpt:18740

iptables -t nat -L KUBE-EXT-YPK62LGMC5XZ5MSF -n -v
Chain KUBE-EXT-YPK62LGMC5XZ5MSF (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 KUBE-MARK-MASQ  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* masquerade traffic for mysql-haproxy:mysql external destinations */
    0     0 KUBE-SVC-YPK62LGMC5XZ5MSF  all  --  *      *       0.0.0.0/0            0.0.0.0/0
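When following the jumps from chain to chain, only the packet counter and the target of each rule matter. The sketch below extracts just those two columns; the helper name chain_targets is an assumption, and it expects the two header lines that iptables -L prints for a single chain.

```shell
# Read `iptables -t nat -L <chain> -n -v` output on stdin and print the packet
# counter and jump target of every rule, skipping the two header lines.
chain_targets() {
  awk 'NR > 2 && NF >= 3 { print $1, $3 }'
}
```

For example, iptables -t nat -L KUBE-EXT-YPK62LGMC5XZ5MSF -n -v | chain_targets shows which chain to inspect next and whether any packets have hit each rule.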

iptables -t nat -L KUBE-SVC-YPK62LGMC5XZ5MSF -n -v
Chain KUBE-SVC-YPK62LGMC5XZ5MSF (2 references)
 pkts bytes target     prot opt in     out     source               destination
   37  2220 KUBE-SEP-XJWUD6DL22O65XMD  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* mysql-haproxy:mysql -> 10.244.227.101:3306 */

iptables -t nat -L KUBE-SEP-XJWUD6DL22O65XMD -n -v
Chain KUBE-SEP-XJWUD6DL22O65XMD (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 KUBE-MARK-MASQ  all  --  *      *       10.244.227.101       0.0.0.0/0            /* mysql-haproxy:mysql */
   38  2280 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* mysql-haproxy:mysql */ tcp to::0 persistent:0 persistent random persistent
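  • 38 2280: packet and byte counters for the rule (38 packets, 2280 bytes).
  • to::0: why is this not the Pod IP? The comment on the KUBE-SVC rule above names 10.244.227.101:3306 as the endpoint, so that is the address expected here.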
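Since the DNAT line does not show a usable address, the expected endpoint can be read from the rule comments in the KUBE-SVC chain instead. This is a minimal sketch that assumes the comment format seen in the output above ("... -> ip:port */"); the helper name svc_endpoints is made up for illustration.

```shell
# Read `iptables -t nat -L KUBE-SVC-... -n -v` output on stdin and print the
# "ip:port" endpoint named in each rule comment ("... -> ip:port */").
svc_endpoints() {
  sed -n 's/.*-> \([0-9][0-9.]*:[0-9][0-9]*\) \*\/.*/\1/p'
}
```

Running iptables -t nat -L KUBE-SVC-YPK62LGMC5XZ5MSF -n -v | svc_endpoints and comparing the result with the Pod IPs from kubectl get po -o wide -A is one way to confirm the Service still points at a live Pod.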