本文永久链接: https://www.xtplayer.cn/coredns/rke2-coredns-custom-config/

After an rke2 cluster is deployed, a cluster agent service is automatically deployed to connect to the rancher server. In a development environment with no internal DNS server, checking the cluster agent pod logs at this point shows errors indicating that the rancher server address cannot be resolved.

You may have added host records (HostAliases) to the host's /etc/hosts, but because the cluster agent runs as a pod/container, it cannot read the host's /etc/hosts entries.
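You can confirm this by comparing the host's /etc/hosts with the one seen inside the agent container. A minimal check, assuming the agent runs as the cattle-cluster-agent deployment in the cattle-system namespace (names may differ in your setup):

# On the node: the manually added record is present
cat /etc/hosts

# Inside the cluster agent pod: the pod has its own /etc/hosts,
# so the record added on the host does not show up here
kubectl -n cattle-system exec deploy/cattle-cluster-agent -- cat /etc/hosts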

To prevent the rke2 cluster from becoming unusable because the cluster agent cannot reach the rancher server, rke2 provides a way to customize the coredns configuration.

  • Run the following command to view the default coredns configuration:
root@rke2-1:~# kubectl  -n kube-system  get configmaps  rke2-coredns-rke2-coredns -oyaml
apiVersion: v1
data:
  Corefile: ".:53 {\n errors \n health {\n lameduck 5s\n }\n ready
    \n kubernetes cluster.local cluster.local in-addr.arpa ip6.arpa {\n pods
    insecure\n fallthrough in-addr.arpa ip6.arpa\n ttl 30\n }\n prometheus
    \ 0.0.0.0:9153\n forward . /etc/resolv.conf\n cache 30\n loop \n
    \ reload \n loadbalance \n}"
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: rke2-coredns
    meta.helm.sh/release-namespace: kube-system
  creationTimestamp: "2023-02-07T04:50:45Z"
  labels:
    app.kubernetes.io/instance: rke2-coredns
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: rke2-coredns
    helm.sh/chart: rke2-coredns-1.19.401
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: CoreDNS
  name: rke2-coredns-rke2-coredns
  namespace: kube-system
  resourceVersion: "1039432"
  uid: 8ed55660-07f1-47cf-9acc-fae32a392d5b
root@rke2-1:~#
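For readability, the escaped Corefile string above is equivalent to the following Corefile:

.:53 {
    errors
    health {
        lameduck 5s
    }
    ready
    kubernetes cluster.local cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
        ttl 30
    }
    prometheus 0.0.0.0:9153
    forward . /etc/resolv.conf
    cache 30
    loop
    reload
    loadbalance
}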
  • Write a HelmChartConfig that overrides the coredns chart values and adds the required records through the hosts plugin:
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-coredns
  namespace: kube-system
spec:
  valuesContent: |-
    servers:
    - zones:
      - zone: example.org
      port: 53
      # If serviceType is nodePort you can specify nodePort here
      # nodePort: 30053
      plugins:
      - name: hosts
        configBlock: |-
          1.2.3.4 www.aaa.com
          fallthrough
      - name: errors
      # Serves a /health endpoint on :8080, required for livenessProbe
      - name: health
        configBlock: |-
          lameduck 5s
      # Serves a /ready endpoint on :8181, required for readinessProbe
      - name: ready
      # Required to query kubernetes API for data
      - name: kubernetes
        parameters: cluster.local in-addr.arpa ip6.arpa
        configBlock: |-
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
          ttl 30
      # Serves a /metrics endpoint on :9153, required for serviceMonitor
      - name: prometheus
        parameters: 0.0.0.0:9153
      # https://coredns.io/plugins/forward/
      - name: forward
        parameters: . /etc/resolv.conf
        configBlock: |-
          # max_fails is the number of subsequent failed health checks that are needed before considering an upstream to be down. If 0, the upstream will never be marked as down (nor health checked). Default is 2.
          max_fails 2
          # expire (cached) connections after this time, the default is 10s
          expire 10s
          # policy default random, Optional: random|round_robin|sequential
          ## random is a policy that implements random upstream selection.
          ## round_robin is a policy that selects hosts based on round robin ordering.
          ## sequential is a policy that selects hosts based on sequential ordering.
          policy random
      - name: cache
        parameters: 30
      - name: loop
      - name: reload
      - name: loadbalance
      - name: log
    - zones:
      - zone: .
      port: 53
      # If serviceType is nodePort you can specify nodePort here
      # nodePort: 30053
      plugins:
      - name: hosts
        configBlock: |-
          1.2.3.4 www.xxx.com
          fallthrough
      - name: errors
      # Serves a /health endpoint on :8080, required for livenessProbe
      - name: health
        configBlock: |-
          lameduck 5s
      # Serves a /ready endpoint on :8181, required for readinessProbe
      - name: ready
      # Required to query kubernetes API for data
      - name: kubernetes
        parameters: cluster.local in-addr.arpa ip6.arpa
        configBlock: |-
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
          ttl 30
      # Serves a /metrics endpoint on :9153, required for serviceMonitor
      - name: prometheus
        parameters: 0.0.0.0:9153
      # https://coredns.io/plugins/forward/
      - name: forward
        parameters: . /etc/resolv.conf
        configBlock: |-
          # max_fails is the number of subsequent failed health checks that are needed before considering an upstream to be down. If 0, the upstream will never be marked as down (nor health checked). Default is 2.
          max_fails 2
          # expire (cached) connections after this time, the default is 10s
          expire 10s
          # policy default random, Optional: random|round_robin|sequential
          ## random is a policy that implements random upstream selection.
          ## round_robin is a policy that selects hosts based on round robin ordering.
          ## sequential is a policy that selects hosts based on sequential ordering.
          policy random
      - name: cache
        parameters: 30
      - name: loop
      - name: reload
      - name: loadbalance
      - name: log
  • Save the configuration as a yaml file, then run kubectl apply -f xx.yaml. For example:
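A minimal sketch, assuming the HelmChartConfig above was saved as coredns-custom.yaml (the file name is arbitrary):

kubectl apply -f coredns-custom.yaml

# rke2 re-runs the coredns helm-install job to render the new values;
# wait for it to complete, then confirm the override is in place
kubectl -n kube-system get pods | grep helm-install-rke2-coredns
kubectl -n kube-system get helmchartconfig rke2-coredns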
  • Run the command again to view the coredns configuration:
root@rke2-1:~# kubectl  -n kube-system  get configmaps  rke2-coredns-rke2-coredns -oyaml
apiVersion: v1
data:
  Corefile: ".:53 {\n hosts {\n 1.2.3.4 www.xxx.com\n }\n errors
    \n health {\n lameduck 5s\n }\n ready \n kubernetes cluster.local
    \ cluster.local in-addr.arpa ip6.arpa {\n pods insecure\n fallthrough
    in-addr.arpa ip6.arpa\n ttl 30\n }\n prometheus 0.0.0.0:9153\n
    \ forward . /etc/resolv.conf\n cache 30\n loop \n reload \n loadbalance
    \n}"
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: rke2-coredns
    meta.helm.sh/release-namespace: kube-system
  creationTimestamp: "2023-02-07T04:50:45Z"
  labels:
    app.kubernetes.io/instance: rke2-coredns
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: rke2-coredns
    helm.sh/chart: rke2-coredns-1.19.401
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: CoreDNS
  name: rke2-coredns-rke2-coredns
  namespace: kube-system
  resourceVersion: "1048131"
  uid: 8ed55660-07f1-47cf-9acc-fae32a392d5b
root@rke2-1:~#
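Decoded, the updated Corefile now begins with the custom hosts block, followed by the same plugins as before:

.:53 {
    hosts {
        1.2.3.4 www.xxx.com
    }
    errors
    health {
        lameduck 5s
    }
    ready
    kubernetes cluster.local cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
        ttl 30
    }
    prometheus 0.0.0.0:9153
    forward . /etc/resolv.conf
    cache 30
    loop
    reload
    loadbalance
}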
  • Test

    Run a command to get the coredns pod IP on the current node, then use ping to test connectivity to that pod IP; normally the ping succeeds.

    Then run nslookup <domain configured in coredns> against the coredns pod IP; it should return the IP configured in the coredns hosts block.

root@rke2-1:~# kubectl  -n kube-system  get pod -owide
NAME                                                    READY   STATUS      RESTARTS        AGE   IP               NODE     NOMINATED NODE   READINESS GATES
cloud-controller-manager-rke2-1                         1/1     Running     15 (145m ago)   55d   192.168.31.141   rke2-1   <none>           <none>
etcd-rke2-1                                             1/1     Running     8 (145m ago)    55d   192.168.31.141   rke2-1   <none>           <none>
helm-install-rke2-calico-crd-m5rfm                      0/1     Completed   0               55d   192.168.31.141   rke2-1   <none>           <none>
helm-install-rke2-calico-s9wbg                          0/1     Completed   1               55d   192.168.31.141   rke2-1   <none>           <none>
helm-install-rke2-coredns-xjrn9                         0/1     Completed   0               93s   192.168.31.141   rke2-1   <none>           <none>
kube-apiserver-rke2-1                                   1/1     Running     8 (145m ago)    55d   192.168.31.141   rke2-1   <none>           <none>
kube-controller-manager-rke2-1                          1/1     Running     15 (145m ago)   55d   192.168.31.141   rke2-1   <none>           <none>
kube-proxy-rke2-1                                       1/1     Running     8 (145m ago)    55d   192.168.31.141   rke2-1   <none>           <none>
kube-scheduler-rke2-1                                   1/1     Running     8 (145m ago)    55d   192.168.31.141   rke2-1   <none>           <none>
rke2-coredns-rke2-coredns-7c8d4b7fc-sp75s               1/1     Running     0               92s   10.42.172.95     rke2-1   <none>           <none>
rke2-coredns-rke2-coredns-autoscaler-768bfc5985-kmvtg   1/1     Running     0               48m   10.42.172.83     rke2-1   <none>           <none>
root@rke2-1:~# ping 10.42.172.95
PING 10.42.172.95 (10.42.172.95) 56(84) bytes of data.
64 bytes from 10.42.172.95: icmp_seq=1 ttl=64 time=0.077 ms
64 bytes from 10.42.172.95: icmp_seq=2 ttl=64 time=0.067 ms
^C
--- 10.42.172.95 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1009ms
rtt min/avg/max/mdev = 0.067/0.072/0.077/0.005 ms
root@rke2-1:~# nslookup www.xxx.com 10.42.172.95
Server: 10.42.172.95
Address: 10.42.172.95#53

Name: www.xxx.com
Address: 1.2.3.4

root@rke2-1:~#
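To also verify resolution through the normal cluster DNS path (the kube-dns service address written into each pod's /etc/resolv.conf) rather than by querying one coredns pod IP directly, you can run the lookup from a temporary pod. A sketch, assuming the busybox:1.36 image is pullable and www.xxx.com is the domain configured in the hosts plugin above:

kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 -- nslookup www.xxx.com
# Expected: www.xxx.com resolves to 1.2.3.4, the address configured in the hosts plugin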