INFO: Environment: CATTLE_ADDRESS=10.1xx.xx.xx CATTLE_AGENT_CONNECT=true CATTLE_CA_CHECKSUM=99e6ccda7c91855xxxxxxxxx4f760c0278713b95b30ab0616b66df1a CATTLE_CLUSTER=false CATTLE_INTERNAL_ADDRESS= CATTLE_K8S_MANAGED=true CATTLE_NODE_NAME=cncxxxx060vl CATTLE_SERVER=https://rancher.xxxx.com
INFO: Using resolv.conf: nameserver 10.1xx.xx.xx nameserver 10.1xx.xx.xx search lnx.fxxxxx fmcxxxxx.cn
INFO: https://rancher.xxxx.com/ping is accessible
INFO: rancher.xxxx.com resolves to 10.1xx.xx.xx
INFO: Value from https://rancher.xxxx.com/v3/settings/cacerts is an x509 certificate
time="2022-09-17T06:10:47Z" level=info msg="Rancher agent version v2.4.8 is starting"
time="2022-09-17T06:10:47Z" level=info msg="Listening on /tmp/log.sock"
time="2022-09-17T06:10:47Z" level=info msg="Option customConfig=map[address:10.1xx.xx.xx internalAddress: label:map[] roles:[] taints:[]]"
time="2022-09-17T06:10:47Z" level=info msg="Option etcd=false"
time="2022-09-17T06:10:47Z" level=info msg="Option controlPlane=false"
time="2022-09-17T06:10:47Z" level=info msg="Option worker=false"
time="2022-09-17T06:10:47Z" level=info msg="Option requestedHostname=cncxxxx060vl"
time="2022-09-17T06:10:47Z" level=info msg="Connecting to wss://rancher.xxxx.com/v3/connect with token ks5rgcxxxxxxxpkb7nd2zj4qsk6snclcxqnn"
time="2022-09-17T06:10:47Z" level=info msg="Connecting to proxy" url="wss://rancher.xxxx.com/v3/connect"
time="2022-09-17T06:10:47Z" level=info msg="Waiting for node to register. Either cluster is not ready for registering or etcd and controlplane node have to be registered first"
time="2022-09-17T06:10:49Z" level=info msg="Waiting for node to register. Either cluster is not ready for registering or etcd and controlplane node have to be registered first"
time="2022-09-17T06:10:51Z" level=info msg="Waiting for node to register. Either cluster is not ready for registering or etcd and controlplane node have to be registered first"
time="2022-09-17T06:10:53Z" level=info msg="Waiting for node to register. Either cluster is not ready for registering or etcd and controlplane node have to be registered first"
time="2022-09-17T06:10:55Z" level=info msg="Waiting for node to register. Either cluster is not ready for registering or etcd and controlplane node have to be registered first"
time="2022-09-17T06:10:57Z" level=info msg="Waiting for node to register. Either cluster is not ready for registering or etcd and controlplane node have to be registered first"
time="2022-09-17T06:10:59Z" level=info msg="Waiting for node to register. Either cluster is not ready for registering or etcd and controlplane node have to be registered first"
time="2022-09-17T06:11:01Z" level=info msg="Waiting for node to register. Either cluster is not ready for registering or etcd and controlplane node have to be registered first"
time="2022-09-17T06:11:03Z" level=info msg="Waiting for node to register. Either cluster is not ready for registering or etcd and controlplane node have to be registered first"
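As the log above shows, the node agent pod in a Rancher custom cluster sometimes logs "Waiting for node to register" repeatedly. This message means that the node has not registered with Rancher successfully. An unregistered node will cause problems later: when the custom cluster is upgraded, the base components on that node cannot be upgraded properly. Note that the node is still registered with the underlying Kubernetes cluster even though it is missing from Rancher, so operations such as creating workload pods on it are not affected.
Run the following command to list the mapping between cluster ID, node ID, and node IP (the NAMESPACE column holds the cluster ID and the NAME column holds the node ID). Take the IP of the node that hosts the pod logging "Waiting for node to register" and look up the matching cluster ID and node ID.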
kubectl get nodes.management.cattle.io -A \
  -o=custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,IP:.spec.customConfig.address,HostnameOverride:.spec.requestedHostname
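When the cluster has many nodes, scanning the table by eye is error-prone. The following is a minimal sketch of a filter that keeps the header row and only the rows whose IP column matches a given address. The function name find_node_by_ip and the address 10.0.0.12 are hypothetical, and the sketch assumes the column order produced by the command above (IP in the third column).

```shell
#!/bin/sh
# find_node_by_ip IP
# Reads the custom-columns table on stdin and prints the header row plus
# every row whose third column (IP) is exactly equal to the given address.
# Hypothetical helper; assumes the column order NAMESPACE, NAME, IP, ...
find_node_by_ip() {
  awk -v ip="$1" 'NR == 1 || $3 == ip'
}

# Usage (assumes kubectl is pointed at the cluster where Rancher runs;
# 10.0.0.12 is a placeholder address):
# kubectl get nodes.management.cattle.io -A \
#   -o=custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,IP:.spec.customConfig.address,HostnameOverride:.spec.requestedHostname \
#   | find_node_by_ip 10.0.0.12
```

An exact match on the column is used instead of grep so that an address such as 10.0.0.1 does not also match 10.0.0.12.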
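Next, pick one healthy node and one unhealthy node from the same cluster, and run the following command for each of them to print the node's configuration YAML.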
kubectl get nodes.management.cattle.io -n c-xxx m-xxxx -oyaml
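One way to put the two dumps side by side is to save each to a file and diff them. This is only a sketch: the helper name compare_node_yaml, the file names, and the node IDs m-good and m-bad are placeholders, and fields such as names, UIDs, and timestamps will differ between any two nodes, so only part of the diff is likely to be relevant.

```shell
#!/bin/sh
# compare_node_yaml GOOD_FILE BAD_FILE
# Prints a unified diff of two saved node YAML dumps. Returns diff's own
# exit status: 0 when the files are identical, 1 when they differ.
# Hypothetical helper for illustration.
compare_node_yaml() {
  diff -u "$1" "$2"
}

# Usage (c-xxx, m-good and m-bad are placeholders for real IDs):
# kubectl get nodes.management.cattle.io -n c-xxx m-good -oyaml > good.yaml
# kubectl get nodes.management.cattle.io -n c-xxx m-bad  -oyaml > bad.yaml
# compare_node_yaml good.yaml bad.yaml
```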