The commands/steps listed on this page can be used to check name resolution issues in your cluster.
Make sure you have configured the correct kubeconfig (for example, export KUBECONFIG=$PWD/kube_config_rancher-cluster.yml for Rancher HA) or that you are using the embedded kubectl via the UI.
export KUBECONFIG=$PWD/kube_config_rancher-cluster.yml
Before running the DNS checks, check the default DNS provider for your cluster and make sure that the overlay network is functioning correctly, as a broken overlay network can also cause DNS resolution to fail (partially or completely).
kubectl -n kube-system get pods -l k8s-app=kube-dns
Example output when using CoreDNS:
NAME                       READY   STATUS    RESTARTS   AGE
coredns-799dffd9c4-6jhlz   1/1     Running   0          76m
Example output when using kube-dns:
NAME                        READY   STATUS    RESTARTS   AGE
kube-dns-5fd74c7488-h6f7n   3/3     Running   0          4m13s
kubectl -n kube-system get svc -l k8s-app=kube-dns
NAME               TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
service/kube-dns   ClusterIP   10.43.0.10   <none>        53/UDP,53/TCP   4m13s
Check if internal cluster names are resolving (in this example, kubernetes.default). The IP shown after Server: should match the CLUSTER-IP of the kube-dns service.
kubectl run -it --rm --restart=Never busybox --image=busybox:1.28 -- nslookup kubernetes.default
Example output:
Server:    10.43.0.10
Address 1: 10.43.0.10 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes.default
Address 1: 10.43.0.1 kubernetes.default.svc.cluster.local
pod "busybox" deleted
Check if external names are resolving (in this example, www.google.com):
kubectl run -it --rm --restart=Never busybox --image=busybox:1.28 -- nslookup www.google.com
Server:    10.43.0.10
Address 1: 10.43.0.10 kube-dns.kube-system.svc.cluster.local

Name:      www.google.com
Address 1: 2a00:1450:4009:80b::2004 lhr35s04-in-x04.1e100.net
Address 2: 216.58.211.100 ams15s32-in-f4.1e100.net
pod "busybox" deleted
If you want to check resolution of domain names on all hosts, execute the following steps:
Save the following file as ds-dnstest.yml:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: dnstest
spec:
  selector:
    matchLabels:
      name: dnstest
  template:
    metadata:
      labels:
        name: dnstest
    spec:
      tolerations:
      - operator: Exists
      containers:
      - image: busybox:1.28
        imagePullPolicy: Always
        name: alpine
        command: ["sh", "-c", "tail -f /dev/null"]
        terminationMessagePath: /dev/termination-log
Launch it using kubectl create -f ds-dnstest.yml
kubectl create -f ds-dnstest.yml
Wait until kubectl rollout status ds/dnstest -w returns: daemon set "dnstest" successfully rolled out.
kubectl rollout status ds/dnstest -w
daemon set "dnstest" successfully rolled out
Configure the environment variable DOMAIN to a fully qualified domain name (FQDN) that the hosts should be able to resolve (www.google.com is used as an example), then run the following command to have each container on every host resolve it (this is a single-line command):
export DOMAIN=www.google.com; echo "=> Start DNS resolve test"; kubectl get pods -l name=dnstest --no-headers -o custom-columns=NAME:.metadata.name,HOSTIP:.status.hostIP | while read pod host; do kubectl exec $pod -- /bin/sh -c "nslookup $DOMAIN > /dev/null 2>&1"; RC=$?; if [ $RC -ne 0 ]; then echo $host cannot resolve $DOMAIN; fi; done; echo "=> End DNS resolve test"
When this command has finished running, the output indicating everything is correct is:
=> Start DNS resolve test
=> End DNS resolve test
If the output contains errors, the listed host(s) cannot resolve the given FQDN.
Example error output from a situation where the host with IP 209.97.182.150 had its UDP ports blocked:
=> Start DNS resolve test
command terminated with exit code 1
209.97.182.150 cannot resolve www.google.com
=> End DNS resolve test
Clean up the dnstest DaemonSet by running kubectl delete ds/dnstest.
kubectl delete ds/dnstest
You can check the logging of the DNS pods using:
kubectl -n kube-system logs -l k8s-app=kube-dns
The CoreDNS configuration is stored in the configmap coredns in the kube-system namespace. To retrieve the Corefile, run:
kubectl -n kube-system get configmap coredns -o go-template={{.data.Corefile}}
By default, the configured nameservers on the host (in /etc/resolv.conf) will be used as upstream nameservers for CoreDNS. You can check this file on the host or run the following Pod with dnsPolicy set to Default, which will inherit the /etc/resolv.conf from the host it is running on.
kubectl run -i --restart=Never --rm test-${RANDOM} --image=ubuntu --overrides='{"kind":"Pod", "apiVersion":"v1", "spec": {"dnsPolicy":"Default"}}' -- sh -c 'cat /etc/resolv.conf'
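The --overrides flag in the command above corresponds to a Pod spec like the following sketch (the pod and container names here are arbitrary, chosen for illustration):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resolvconf-test    # arbitrary name for this sketch
spec:
  dnsPolicy: Default       # inherit /etc/resolv.conf from the host the pod lands on
  restartPolicy: Never
  containers:
  - name: test
    image: ubuntu
    command: ["sh", "-c", "cat /etc/resolv.conf"]
```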
You can enable query logging by adding the log plugin to the Corefile configuration in the configmap coredns. You can do so by using kubectl -n kube-system edit configmap coredns or use the command below to replace the configuration in place:
kubectl -n kube-system edit configmap coredns
kubectl get configmap -n kube-system coredns -o json | sed -e 's_loadbalance_log\\n    loadbalance_g' | kubectl apply -f -
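After the change, the Corefile will contain the log plugin just before loadbalance. A sketch of what the relevant server block might look like (the exact plugin list varies per cluster and CoreDNS version):

```
.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
      pods insecure
      fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    forward . /etc/resolv.conf
    cache 30
    loop
    reload
    log
    loadbalance
}
```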
All queries will now be logged and can be checked using the kubectl logs command shown above.
By default, the configured nameservers on the host (in /etc/resolv.conf) will be used as upstream nameservers for kube-dns. Sometimes the host runs a local caching DNS nameserver, in which case the address in /etc/resolv.conf points to an address in the loopback range (127.0.0.0/8), which is unreachable from the container. On Ubuntu 18.04, this is done by systemd-resolved. Since Rancher v2.0.7, we detect if systemd-resolved is running and automatically use the /etc/resolv.conf file with the correct upstream nameservers (located at /run/systemd/resolve/resolv.conf).
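A minimal sketch of the loopback check described above. It uses a sample file standing in for the host's /etc/resolv.conf (the 127.0.0.53 entry mimics what systemd-resolved writes); on a real host you would run the grep against /etc/resolv.conf itself:

```shell
# Write a sample resolv.conf as systemd-resolved would produce it
cat > /tmp/sample-resolv.conf <<'EOF'
nameserver 127.0.0.53
search example.internal
EOF

# A nameserver in 127.0.0.0/8 is unreachable from inside containers;
# in that case the real upstreams live in /run/systemd/resolve/resolv.conf
if grep -Eq '^nameserver[[:space:]]+127\.' /tmp/sample-resolv.conf; then
  echo "loopback resolver detected; use /run/systemd/resolve/resolv.conf instead"
fi
```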
Use the following command to check the upstream nameservers used by the kubedns container:
kubectl -n kube-system get pods -l k8s-app=kube-dns --no-headers -o custom-columns=NAME:.metadata.name,HOSTIP:.status.hostIP | while read pod host; do echo "Pod ${pod} on host ${host}"; kubectl -n kube-system exec $pod -c kubedns -- cat /etc/resolv.conf; done
Pod kube-dns-667c7cb9dd-z4dsf on host x.x.x.x
nameserver 1.1.1.1
nameserver 8.8.4.4
If the output shows an address in the loopback range (127.0.0.0/8), you can correct this by pointing the kubelet to a resolv.conf file that contains reachable upstream nameservers (for example, /run/resolvconf/resolv.conf), using the extra_args directive in your cluster configuration file:
services:
  kubelet:
    extra_args:
      resolv-conf: "/run/resolvconf/resolv.conf"
Note: As the kubelet is running inside a container, files located in /etc and /usr on the host are found at /host/etc and /host/usr inside the kubelet container.
See Editing Cluster as YAML for how to apply this change. When the provisioning of the cluster has finished, you must delete the kube-dns pod to activate the new setting:
kubectl delete pods -n kube-system -l k8s-app=kube-dns
pod "kube-dns-5fd74c7488-6pwsf" deleted
Try to resolve the domain name again using the resolution checks described earlier (Check if domain names are resolving).
If you want to check the kube-dns configuration in your cluster (for example, to check if there are different upstream nameservers configured), you can run the following command to list the kube-dns configuration:
kubectl -n kube-system get configmap kube-dns -o go-template='{{range $key, $value := .data}}{{ $key }}{{":"}}{{ $value }}{{"\n"}}{{end}}'
Example output when different upstream nameservers are configured:
upstreamNameservers:["1.1.1.1"]
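For reference, upstream nameservers for kube-dns are configured through the kube-dns ConfigMap. A minimal sketch of such a ConfigMap (the 1.1.1.1 value mirrors the example output above):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
data:
  # JSON-encoded list of resolvers kube-dns forwards external queries to
  upstreamNameservers: '["1.1.1.1"]'
```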