
Kubernetes upgrade problem: "error syncing endpoints with etc: context deadline exceeded"

Bounty: 30 points (园豆) [Resolved] Resolved on 2021-01-16 10:46

I am preparing to upgrade k8s from 1.17.0 to 1.18.0 and ran the following command:

kubeadm upgrade plan --ignore-preflight-errors=CoreDNSUnsupportedPlugins

It fails during "Running cluster health checks":

[upgrade] Running cluster health checks
error syncing endpoints with etc: context deadline exceeded

How can this be fixed?

k8s
Follow-up to the question:

The following command showed that the etcd container had not started:

docker ps | grep etcd 

After I fixed the misconfiguration in /etc/kubernetes/manifests/etcd.yaml, the etcd container started successfully, but the problem persisted:

[upgrade] Running cluster health checks
I0115 18:49:48.762279    9793 health.go:158] Creating Job "upgrade-health-check" in the namespace "kube-system"
I0115 18:49:48.792884    9793 health.go:188] Job "upgrade-health-check" in the namespace "kube-system" is not yet complete, retrying
I0115 18:49:49.794837    9793 health.go:188] Job "upgrade-health-check" in the namespace "kube-system" is not yet complete, retrying
I0115 18:49:50.794333    9793 health.go:188] Job "upgrade-health-check" in the namespace "kube-system" is not yet complete, retrying
I0115 18:49:51.829966    9793 health.go:195] Job "upgrade-health-check" in the namespace "kube-system" completed
I0115 18:49:51.830007    9793 health.go:201] Deleting Job "upgrade-health-check" in the namespace "kube-system"
I0115 18:49:51.839520    9793 etcd.go:178] retrieving etcd endpoints from "kubeadm.kubernetes.io/etcd.advertise-client-urls" annotation in etcd Pods
I0115 18:49:51.847027    9793 etcd.go:192] etcd Pod "etcd-k8s-master0" is missing the "kubeadm.kubernetes.io/etcd.advertise-client-urls" annotation; cannot infer etcd advertise client URL using the Pod annotation
I0115 18:49:51.847136    9793 etcd.go:202] retrieving etcd endpoints from the cluster status
I0115 18:49:51.849472    9793 etcd.go:102] etcd endpoints read from pods: https://10.0.9.171:2379
context deadline exceeded
error syncing endpoints with etc

Next, I installed the etcdctl command on the master:

wget -c https://github.com/etcd-io/etcd/releases/download/v3.4.14/etcd-v3.4.14-linux-amd64.tar.gz
tar -xzf etcd-v3.4.14-linux-amd64.tar.gz
mv etcd-v3.4.14-linux-amd64/etcdctl /usr/bin/etcdctl

Then I connected to etcd with etcdctl:

etcdctl --endpoints 10.0.9.171:2379 --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member list

The error it returned revealed the cause:

{"level":"warn","ts":"2021-01-16T08:24:50.026+0800","caller":"clientv3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"endpoint://client-efd79c04-7e43-492b-bbd5-defe5b400e68/10.0.9.171:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: authentication handshake failed: x509: certificate is valid for 10.0.1.81, 127.0.0.1, ::1, not 10.0.9.171\""}

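So it is a certificate problem: the etcd server certificate is valid only for 10.0.1.81, 127.0.0.1, and ::1, not for 10.0.9.171, the address kubeadm is connecting to.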
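The same diagnosis can be checked directly against the certificate file, without going through etcdctl, by looking at its Subject Alternative Names. The following is a minimal sketch and not part of the original post: the function name cert_covers_ip is hypothetical, and it assumes openssl is installed and prints IPv4 SAN entries in its usual "IP Address:<addr>" form (IPv6 entries may be printed in expanded form, so this sketch is only meant for IPv4).

```shell
# cert_covers_ip CERT_FILE IP
# Hypothetical helper: returns 0 if the certificate's Subject Alternative
# Name list contains exactly the given IPv4 address, non-zero otherwise.
cert_covers_ip() {
    openssl x509 -in "$1" -noout -text \
        | grep -A1 'Subject Alternative Name' \
        | tr ',' '\n' \
        | sed 's/^ *//' \
        | grep -Fx "IP Address:$2" >/dev/null
}

# Example use on the master (paths and address taken from the post):
#   cert_covers_ip /etc/kubernetes/pki/etcd/server.crt 10.0.9.171 \
#       && echo "covered" || echo "NOT covered"
```

For the certificate described in the error above, the check for 10.0.9.171 would be expected to fail, while 10.0.1.81 and 127.0.0.1 would pass.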

dudu | Level 7 expert | points (园豆): 31075
Asked on: 2021-01-15 18:37
Best answer

The problem was resolved after regenerating the etcd-server certificate:

cd /etc/kubernetes/pki/etcd
rm server.crt server.key
kubeadm init phase certs etcd-server
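The commands above delete the old certificate and key outright. A more cautious variant is to move them aside first so they can be restored if regeneration goes wrong. The sketch below is only an illustration under that assumption: the helper name backup_aside and the ".bak.<timestamp>" suffix are hypothetical and not something kubeadm provides.

```shell
# backup_aside FILE...
# Hypothetical helper: rename each existing file to FILE.bak.<timestamp>
# instead of deleting it. Missing files are skipped silently.
backup_aside() {
    stamp=$(date +%Y%m%d%H%M%S)
    for f in "$@"; do
        if [ -e "$f" ]; then
            mv "$f" "$f.bak.$stamp"
        fi
    done
    return 0
}

# Example use in place of the rm step (paths taken from the answer):
#   cd /etc/kubernetes/pki/etcd
#   backup_aside server.crt server.key
#   kubeadm init phase certs etcd-server
```

Because the original file names no longer exist after the move, the regeneration step sees the same state as it would after rm.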
dudu | Level 7 expert | points (园豆): 31075 | 2021-01-16 10:46