Communication Errors
Background
- ✅ MySQL accepts connections from inside its own pod.
- ❌ MySQL cannot be reached through the K8s Service.
- The Service endpoint is correct.
- getent inside the pod sometimes fails.
- Pod network issue (Calico CNI).
Errors
com.mysql.cj.jdbc.exceptions.CommunicationsException: Communications link failure
...
The driver has not received any packets from the server
2025-07-29T02:45:56.595368Z 21528 [Note] Aborted connection 21528 to db: 'foo' user: 'root' host: '10.x.x.x' (Got an error reading communication packets)
Solutions
Check MySQL Inside
kubectl exec -it mysql-0 -- /bin/bash
mysql -uroot -p
Check Pod Network
If the lookup below fails only intermittently, some CNI (Calico) pods might be down.
getent ahosts mysql-haproxy.ns.svc.cluster.local
10.x.x.x STREAM mysql-haproxy.ns.svc.cluster.local
10.x.x.x DGRAM
10.x.x.x RAW
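Because the failure is intermittent, a single successful lookup proves little. The sketch below repeats a command and counts how often it fails. `count_failures` is a hypothetical helper name, and the run count of 20 in the usage comment is an arbitrary choice, not a value from the original investigation.

```shell
# Run a command N times and print how many of the runs failed.
# Usage: count_failures N command [args...]
count_failures() {
  n=$1
  shift
  fails=0
  i=0
  while [ "$i" -lt "$n" ]; do
    # Discard the command's output; only its exit status matters here.
    "$@" >/dev/null 2>&1 || fails=$((fails + 1))
    i=$((i + 1))
  done
  echo "$fails"
}

# Example (run inside the pod):
#   count_failures 20 getent ahosts mysql-haproxy.ns.svc.cluster.local
```

A result above 0 suggests the lookup is flaky rather than consistently broken, which points toward the network layer and away from the Service definition.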
Check CNI (Calico) Pods
kubectl get pods | grep calico
calico-kube-controllers-6f697bd6bc-8s6df 1/1 Running 1 (17h ago) 27d
calico-node-jq9lc 0/1 Running 0 6s
calico-node-ms4wt 1/1 Running 2 (16h ago) 27d
calico-node-q4z9q 1/1 Running 2 (16h ago) 27d
Delete calico-node-jq9lc, the pod that is not ready (0/1), to fix the problem. The calico-node DaemonSet recreates it automatically.
kubectl delete pod calico-node-jq9lc
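Spotting the not-ready pod by eye works for four pods but not for a large cluster. The sketch below filters `kubectl get pods` output for pods whose READY column shows fewer ready containers than expected. `not_ready_pods` is a hypothetical helper name, and the `kube-system` namespace in the usage comment is an assumption about where Calico is installed.

```shell
# Read `kubectl get pods` output on stdin and print the names of pods
# whose READY column (for example 0/1) has ready != total.
not_ready_pods() {
  awk '{
    split($2, r, "/")
    # The header line has no "/" in column 2, so r[2] is empty and it is skipped.
    if (r[2] != "" && r[1] != r[2]) print $1
  }'
}

# Example (namespace is an assumption):
#   kubectl get pods -n kube-system | grep calico | not_ready_pods
```

A pod that was created only seconds ago may simply still be starting, so it is worth checking the AGE column before deleting anything.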
Calico Error Logs
kubectl logs calico-node-ms4wt --previous
2025-07-28 09:20:42.273 [WARNING][96] felix/route_table.go 740: Failed to delete route error=no such process ifaceName="cali125e9a7ccf9" ifaceRegex="^cali.*" ipVersion=0x4 tableIndex=0
2025-07-28 09:20:44.002 [INFO][96] felix/route_table.go 1205: Failed to access interface because it doesn't exist. error=Link not found ifaceName="cali6ac61bc7d83" ifaceRegex="^cali.*" ipVersion=0x4 tableIndex=0
2025-07-28 09:20:44.002 [INFO][96] felix/route_table.go 1273: Failed to get interface; it's down/gone. error=Link not found ifaceName="cali6ac61bc7d83" ifaceRegex="^cali.*" ipVersion=0x4 tableIndex=0
2025-07-28 09:21:08.572 [INFO][96] felix/route_table.go 1205: Failed to access interface because it doesn't exist. error=Link not found ifaceName="califb58b76f486" ifaceRegex="^cali.*" ipVersion=0x4 tableIndex=0
2025-07-28 09:21:08.572 [INFO][96] felix/route_table.go 1273: Failed to get interface; it's down/gone. error=Link not found ifaceName="califb58b76f486" ifaceRegex="^cali.*" ipVersion=0x4 tableIndex=0
2025-07-28 09:50:13.390 [INFO][98] confd/watchercache.go 125: Watch error received from Upstream ListRoot="/calico/ipam/v2/host/ingress-2" error=too old resource version: 1057 (7933983)
Failed to Get Interface
Failed to get interface; it's down/gone. error=Link not found ifaceName="cali125e9a7ccf9"
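Calico tried to operate on an interface (such as cali125e9a7ccf9), but the kernel reported that the interface no longer exists.
- The related Pod was deleted, but Calico has not yet finished cleaning up the corresponding routes.
- Or the Pod was recreated quickly, and the old interface disappeared before Felix noticed.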
Failed to Delete Route
Failed to delete route error=no such process ifaceName="cali125e9a7ccf9"
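Calico tried to delete a route bound to this interface, but the system returned the error no such process (the kernel's reply when the route is already gone).
This is a harmless warning: the system state had already changed, so Calico's cleanup was a redundant repeat.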
Too Old Resource Version
Watch error received from Upstream ListRoot="/calico/ipam/v2/host/ingress-2" error=too old resource version
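confd (Calico's internal configuration sync module) failed to watch data from etcd (or kube-apiserver) because its resource version was too old.
This is normal behavior for the Kubernetes watch mechanism: when the client's version falls too far behind (for example, after a local network outage or a lag caused by high load), the watch is dropped and the client re-lists the full data set.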
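Since most of these messages are benign on their own, it can help to see how often each kind appears before reading the log line by line. The sketch below counts log lines per level and source file. `summarize_calico_log` is a hypothetical helper name, and it assumes the line layout shown in the logs above (date, time, `[LEVEL][pid]`, source file).

```shell
# Read Calico log lines on stdin and print "count LEVEL source-file",
# most frequent first.
summarize_calico_log() {
  awk '{
    # Column 3 looks like "[WARNING][96]"; keep the text inside the first brackets.
    level = substr($3, 2, index($3, "]") - 2)
    count[level " " $4]++
  }
  END { for (k in count) print count[k], k }' | sort -rn
}

# Example:
#   kubectl logs calico-node-ms4wt --previous | summarize_calico_log
```

A sudden jump in WARNING lines from felix/route_table.go around the time of the outage would be worth a closer look than a steady trickle of INFO lines.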
Calico Troubleshooting
Pod Network Interface
ip route | grep cali
10.244.181.190 dev calia3bb7ac8316 scope link
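Field | Meaning |
---|---|
10.244.181.190 | IP address of the target Pod (within the Pod CIDR) |
dev calia3bb7ac8316 | Network interface the traffic goes through (a veth interface created automatically by Calico) |
scope link | The route is valid only on the local link, i.e. it is a directly connected route |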
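The command lists every route in the Linux routing table that goes through a cali* interface. Each cali* interface is the host-side end of the virtual interface pair (veth pair) that the Calico CNI plugin creates for a Pod.
These routes represent the Pods running on this node: each Pod IP is reached through the virtual interface Calico created for it.
In other words, this is how Calico works: each Pod is assigned an IP, Calico creates a veth interface for the Pod, and the Linux routing table gets a route to that IP that sends packets out through the matching caliXXXX interface.
To find the Pod that owns a given IP: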
kubectl get po -o wide -A | grep 10.244.181.190
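The route-to-Pod lookup can be sketched as a small filter that turns `ip route` output into "Pod IP, interface" pairs, which can then be matched against `kubectl get po -o wide -A`. `cali_routes` is a hypothetical helper name, and the filter assumes the per-Pod route format shown above, where the destination comes first and is followed by `dev`.

```shell
# Read `ip route` output on stdin and print "POD_IP INTERFACE" for every
# route that goes directly through a cali* interface.
cali_routes() {
  awk '$2 == "dev" && $3 ~ /^cali/ { print $1, $3 }'
}

# Example (run on the node):
#   ip route | cali_routes
```

Routes that use `via` (such as the default route) are skipped because their second field is not `dev`.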
Network Traffic
If XDP (eXpress Data Path) is disabled, the XDP counters are all 0.
ethtool -S calia3bb7ac8316
NIC statistics:
peer_ifindex: 4
rx_queue_0_xdp_packets: 0
rx_queue_0_xdp_bytes: 0
rx_queue_0_xdp_drops: 0
Check interface details.
ip link show cali99a6c2c8fff
39: cali99a6c2c8fff@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1480 qdisc noqueue state UP mode DEFAULT group default
link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 5
Check interface traffic.
ip -s link show cali99a6c2c8fff
39: cali99a6c2c8fff@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1480 qdisc noqueue state UP mode DEFAULT group default
link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 5
RX: bytes packets errors dropped overrun mcast
1684652748 1060169 0 0 0 0
TX: bytes packets errors dropped carrier collsns
745562168 1438535 0 0 0 0
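- RX (receive):
- Received 1.68 GB of data in about 1.06 million packets.
- TX (transmit):
- Sent 745 MB of data in about 1.44 million packets.
The interface is operating normally, with no errors (errors=0), drops (dropped=0), or collisions (collsns=0).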
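The same health check can be sketched as a filter over interface statistics, so that the error and drop counters stand out from the byte totals. `iface_error_counters` is a hypothetical helper name, and the filter assumes the column order shown above (bytes, packets, errors, dropped) in the output of `ip -s link show`.

```shell
# Read `ip -s link show <iface>` output on stdin and print the error and
# drop counters for both directions.
iface_error_counters() {
  awk '
    # The values sit on the line right after each RX:/TX: header line.
    $1 == "RX:" { getline; print "rx_errors=" $3, "rx_dropped=" $4 }
    $1 == "TX:" { getline; print "tx_errors=" $3, "tx_dropped=" $4 }
  '
}

# Example (run on the node):
#   ip -s link show cali99a6c2c8fff | iface_error_counters
```

Any non-zero value here would point at the veth pair or the node itself and away from the Service rules.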
iptables Routing
Check the NAT rules for the MySQL port (3306) and the NodePort (18740).
PREROUTING
↓
KUBE-SERVICES (matches dpt:18740)
↓
KUBE-EXT-XXXXXXX (the chain shown here)
↓
KUBE-SVC-XXXXXXXX (Service jump)
↓
KUBE-SEP-XXXXXXXX (jump to the Pod endpoint)
↓
DNAT to Pod IP:Port
iptables -t nat -L -n -v | grep -E '3306|18740'
0 0 KUBE-EXT-YPK62LGMC5XZ5MSF tcp -- * * 0.0.0.0/0 0.0.0.0/0 /* mysql-haproxy:mysql */ tcp dpt:18740
iptables -t nat -L KUBE-EXT-YPK62LGMC5XZ5MSF -n -v
Chain KUBE-EXT-YPK62LGMC5XZ5MSF (1 references)
pkts bytes target prot opt in out source destination
0 0 KUBE-MARK-MASQ all -- * * 0.0.0.0/0 0.0.0.0/0 /* masquerade traffic for mysql-haproxy:mysql external destinations */
0 0 KUBE-SVC-YPK62LGMC5XZ5MSF all -- * * 0.0.0.0/0 0.0.0.0/0
iptables -t nat -L KUBE-SVC-YPK62LGMC5XZ5MSF -n -v
Chain KUBE-SVC-YPK62LGMC5XZ5MSF (2 references)
pkts bytes target prot opt in out source destination
37 2220 KUBE-SEP-XJWUD6DL22O65XMD all -- * * 0.0.0.0/0 0.0.0.0/0 /* mysql-haproxy:mysql -> 10.244.227.101:3306 */
iptables -t nat -L KUBE-SEP-XJWUD6DL22O65XMD -n -v
Chain KUBE-SEP-XJWUD6DL22O65XMD (1 references)
pkts bytes target prot opt in out source destination
0 0 KUBE-MARK-MASQ all -- * * 10.244.227.101 0.0.0.0/0 /* mysql-haproxy:mysql */
38 2280 DNAT tcp -- * * 0.0.0.0/0 0.0.0.0/0 /* mysql-haproxy:mysql */ tcp to::0 persistent:0 persistent random persistent
- 38 2280: Traffic counters (packets and bytes).
- to::0: Why is this not the Pod IP?
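The chain-by-chain walk above can be sped up with a filter that pulls the next KUBE-EXT, KUBE-SVC, or KUBE-SEP target out of each listing. `kube_chain_targets` is a hypothetical helper name, and the filter assumes the `-n -v` column layout shown above, where the target is the third column.

```shell
# Read `iptables -t nat -L <chain> -n -v` output on stdin and print the
# kube-proxy chains that the listed rules jump to.
kube_chain_targets() {
  awk '$3 ~ /^KUBE-(EXT|SVC|SEP)-/ { print $3 }'
}

# Example (run on the node as root):
#   iptables -t nat -L KUBE-EXT-YPK62LGMC5XZ5MSF -n -v | kube_chain_targets
```

KUBE-MARK-MASQ is left out on purpose, since it only marks packets for masquerading and is not a step toward the endpoint.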