Unable to start Kubernetes Dashboard

Problem description

While trying to install a Kubernetes cluster with the Dashboard on Ubuntu 20.04 LTS, a user ran into a problem. He ran the following commands:

swapoff -a
sudo apt update
sudo apt install docker.io
sudo systemctl start docker
sudo systemctl enable docker
sudo apt install apt-transport-https curl
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add
echo "deb https://apt.kubernetes.io/ kubernetes-xenial main" >> ~/kubernetes.list
sudo mv ~/kubernetes.list /etc/apt/sources.list.d
sudo apt update
sudo apt install kubeadm kubelet kubectl kubernetes-cni
sudo kubeadm init --pod-network-cidr=192.168.0.0/16
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.5.0/aio/deploy/recommended.yaml
kubectl proxy --address 192.168.1.133 --accept-hosts '.*'

But when he opened http://192.168.1.133:8001/api/v1/namespaces/default/services/https:kubernetes-dashboard:https/proxy, he got the following error:

{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "services \"kubernetes-dashboard\" not found",
  "reason": "NotFound",
  "details": {
    "name": "kubernetes-dashboard",
    "kind": "services"
  },
  "code": 404
}

He listed the Pods and found that the kube-flannel Pod was in the Error state:

kubectl get pods --all-namespaces

Output:

NAMESPACE              NAME                                         READY   STATUS              RESTARTS       AGE
kube-flannel           kube-flannel-ds-f6bwx                        0/1     Error               11 (29s ago)   76m
kube-system            coredns-6d4b75cb6d-rk4kq                     0/1     ContainerCreating   0              77m
kube-system            coredns-6d4b75cb6d-vkpcm                     0/1     ContainerCreating   0              77m
kube-system            etcd-ubuntukubernetis1                       1/1     Running             1 (52s ago)    77m
kube-system            kube-apiserver-ubuntukubernetis1             1/1     Running             1 (52s ago)    77m
kube-system            kube-controller-manager-ubuntukubernetis1    1/1     Running             1 (52s ago)    77m
kube-system            kube-proxy-n6ldq                             1/1     Running             1 (52s ago)    77m
kube-system            kube-scheduler-ubuntukubernetis1             1/1     Running             1 (52s ago)    77m
kubernetes-dashboard   dashboard-metrics-scraper-7bfdf779ff-sdnc8   0/1     Pending             0              75m
kubernetes-dashboard   dashboard-metrics-scraper-8c47d4b5d-2sxrb    0/1     Pending             0              59m
kubernetes-dashboard   kubernetes-dashboard-5676d8b865-fws4j        0/1     Pending             0              59m
kubernetes-dashboard   kubernetes-dashboard-6cdd697d84-nmpv2        0/1     Pending             0              75m

He then checked the logs of the kube-flannel Pod:

kubectl logs -n kube-flannel kube-flannel-ds-f6bwx -p

Output:

Defaulted container "kube-flannel" out of: kube-flannel, install-cni-plugin (init), install-cni (init)
I0724 14:49:57.782499       1 main.go:207] CLI flags config: {etcdEndpoints:http://127.0.0.1:4001,http://127.0.0.1:2379 etcdPrefix:/coreos.com/network etcdKeyfile: etcdCertfile: etcdCAFile: etcdUsername: etcdPassword: version:false kubeSubnetMgr:true kubeApiUrl: kubeAnnotationPrefix:flannel.alpha.coreos.com kubeConfigFile: iface:[] ifaceRegex:[] ipMasq:true ifaceCanReach: subnetFile:/run/flannel/subnet.env publicIP: publicIPv6: subnetLeaseRenewMargin:60 healthzIP:0.0.0.0 healthzPort:0 iptablesResyncSeconds:5 iptablesForwardRules:true netConfPath:/etc/kube-flannel/net-conf.json setNodeNetworkUnavailable:true}
W0724 14:49:57.782676       1 client_config.go:614] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
E0724 14:49:57.892230       1 main.go:224] Failed to create SubnetManager: error retrieving pod spec for 'kube-flannel/kube-flannel-ds-f6bwx': pods "kube-flannel-ds-f6bwx" is forbidden: User "system:serviceaccount:kube-flannel:flannel" cannot get resource "pods" in API group "" in the namespace "kube-flannel"

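The user wants to know how to resolve this.

Solution

Note: the steps below may vary between versions; back up before making any changes.

Step 1

First, the URL you queried is wrong. Judging from the last YAML file you applied and your kubectl get pods -A output, the kubernetes-dashboard Service lives in the kubernetes-dashboard namespace, not in the default namespace.
If you just want to reach the Kubernetes Dashboard, you can replace the kubectl proxy command with the following one and then open https://localhost:8443 in your browser: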

kubectl port-forward -n kubernetes-dashboard deploy/kubernetes-dashboard 8443:8443
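
If you would rather keep using kubectl proxy, the namespace segment of the URL has to match the namespace of the Service. The sketch below only builds that URL; dashboard_proxy_url is a hypothetical helper, and the `https:kubernetes-dashboard:` form (empty port name) is an assumption based on the Service in recommended.yaml exposing a single unnamed port, so verify it against your own Service.

```shell
# Build the kubectl-proxy URL for the Dashboard Service.
# Arguments: proxy host, proxy port, namespace of the Service.
# dashboard_proxy_url is a hypothetical helper, not part of kubectl.
dashboard_proxy_url() {
  printf 'http://%s:%s/api/v1/namespaces/%s/services/https:kubernetes-dashboard:/proxy/\n' "$1" "$2" "$3"
}

# Confirm where the Service actually lives before building the URL:
#   kubectl get svc --all-namespaces | grep kubernetes-dashboard

# Host and port are taken from the kubectl proxy command in the question.
dashboard_proxy_url 192.168.1.133 8001 kubernetes-dashboard
```

The only change relative to the URL in the question that matters for the 404 is the namespace: kubernetes-dashboard instead of default.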

Step 2

Second, your SDN is broken. In your kubectl get pods output, we can see that the kube-flannel Pod is in the Error state.
Check that container's logs to find out why it fails to start:

kubectl logs -n kube-flannel kube-flannel-ds-xxxx -p

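I haven't set up flannel in years, but I remember that besides applying their RBAC and DaemonSet YAML files, I also had to patch the nodes to assign each one a CIDR. For example: kubectl patch node my-node-1 -p '{ "spec": { "podCIDR": "10.32.3.0/24" } }' --type merge. Each podCIDR must be unique, since every node gets its own range to host Pods. If I remember correctly, each podCIDR must also be a subset of the network defined in flannel's net-conf.json; check the ConfigMap created when you installed flannel.
As for your last comment, the error tells us the following: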
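
The subset condition on podCIDR can be checked with a little arithmetic before patching a node. This is a minimal sketch for sh or bash: ip_to_int and cidr_contains are hypothetical helpers, and 10.32.0.0/16 is an illustrative network chosen to match the example podCIDR, not a value taken from any real net-conf.json.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  old_ifs=$IFS; IFS=.
  set -- $1          # split the address on dots into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if CIDR $2 lies entirely inside CIDR $1.
cidr_contains() {
  outer_ip=${1%/*}; outer_len=${1#*/}
  inner_ip=${2%/*}; inner_len=${2#*/}
  # A wider (shorter-prefix) range can never fit inside a narrower one.
  [ "$inner_len" -ge "$outer_len" ] || return 1
  if [ "$outer_len" -eq 0 ]; then
    mask=0
  else
    mask=$(( (0xFFFFFFFF << (32 - outer_len)) & 0xFFFFFFFF ))
  fi
  # Both addresses must share the outer network's prefix bits.
  [ $(( $(ip_to_int "$outer_ip") & mask )) -eq $(( $(ip_to_int "$inner_ip") & mask )) ]
}

# Illustrative values only: read the real Network from the flannel ConfigMap
# (assumed here to be named kube-flannel-cfg):
#   kubectl -n kube-flannel get configmap kube-flannel-cfg -o yaml
if cidr_contains 10.32.0.0/16 10.32.3.0/24; then
  echo "podCIDR fits inside the flannel network"
fi
```

The question passed 192.168.0.0/16 to kubeadm init, so it may be worth running the same check with that range against the Network value in the ConfigMap.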

Failed to create SubnetManager: error retrieving pod spec for 'kube-flannel/kube-flannel-ds-f6bwx': pods "kube-flannel-ds-f6bwx" is forbidden: User "system:serviceaccount:kube-flannel:flannel" cannot get resource "pods" in API group "" in the namespace "kube-flannel"

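Looking back at the files you used to set up flannel: the first one sets everything up in the kube-flannel namespace, while the second is an RBAC file that creates a ClusterRoleBinding for a ServiceAccount in kube-system. As a result, the flannel ServiceAccount in kube-flannel is not allowed to read Pods. To fix your SDN, you can create the following: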

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel-fix
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
  - kind: ServiceAccount
    name: flannel
    namespace: kube-flannel
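
One way to apply the binding and confirm that it took effect is sketched below. It assumes the manifest above was saved as flannel-fix.yaml (a hypothetical file name) and that the DaemonSet is named kube-flannel-ds, as the Pod name in the question suggests; sa_user and fix_flannel_rbac are illustrative helpers, not part of kubectl.

```shell
# Compose the user name the API server uses for a ServiceAccount.
# Arguments: namespace, ServiceAccount name.
sa_user() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Hypothetical helper: apply the binding, check access, restart flannel.
# Assumes the manifest was saved as flannel-fix.yaml in the current directory.
fix_flannel_rbac() {
  kubectl apply -f flannel-fix.yaml &&
  # Exits non-zero if the ServiceAccount still cannot read Pods.
  kubectl auth can-i get pods -n kube-flannel --as "$(sa_user kube-flannel flannel)" &&
  # Recreate the flannel Pods so they retry with the new permissions.
  kubectl -n kube-flannel rollout restart daemonset kube-flannel-ds
}

# Run it against the cluster with:
#   fix_flannel_rbac
```

Once flannel is Running, the coredns and Dashboard Pods should be able to leave ContainerCreating and Pending, since they were waiting on the Pod network.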

By the way: in your case, kube-flannel-rbac is unnecessary. It would only be needed if you installed flannel from their legacy manifest (https://github.com/flannel-io/flannel/blob/master/Documentation/k8s-manifests/kube-flannel-legacy.yml). In your case, the ClusterRoleBinding we are fixing would have been created correctly by simply applying https://github.com/flannel-io/flannel/blob/master/Documentation/kube-flannel.yml.
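
Note that the links above are GitHub blob pages, which serve HTML; kubectl apply -f needs the raw file. A small sketch of the conversion follows, with github_raw_url as a hypothetical helper. It assumes the path in the link still exists on that branch.

```shell
# Rewrite a GitHub "blob" page URL into its raw.githubusercontent.com form.
# github_raw_url is a hypothetical helper, not part of kubectl or git.
github_raw_url() {
  printf '%s\n' "$1" |
    sed -E 's#^https://github\.com/([^/]+)/([^/]+)/blob/#https://raw.githubusercontent.com/\1/\2/#'
}

github_raw_url https://github.com/flannel-io/flannel/blob/master/Documentation/kube-flannel.yml

# Then apply the manifest from the raw URL:
#   kubectl apply -f "$(github_raw_url https://github.com/flannel-io/flannel/blob/master/Documentation/kube-flannel.yml)"
```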
