Building a Kubernetes Cluster

Here we use kubeadm to build the K8s cluster.

Kubeadm is a tool that provides kubeadm init and kubeadm join for quickly deploying a Kubernetes cluster.

Official docs: https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm/
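As a quick orientation, here is a minimal sketch of the workflow this guide follows; the placeholders <master-ip>, <token> and <hash> stand for values produced later during init, they are not real values:

# on the master (control-plane) node: bootstrap the cluster
kubeadm init --pod-network-cidr=10.244.0.0/16
# on each worker node: join it using the command that kubeadm init prints
kubeadm join <master-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>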

Getting the Images

Download the images

docker pull k8s.gcr.io/kube-apiserver:v1.16.1
docker pull k8s.gcr.io/kube-proxy:v1.16.1
docker pull k8s.gcr.io/kube-controller-manager:v1.16.1
docker pull k8s.gcr.io/kube-scheduler:v1.16.1
docker pull k8s.gcr.io/etcd:3.3.15
docker pull k8s.gcr.io/pause:3.1
docker pull k8s.gcr.io/coredns:1.6.2

You may find that Google's k8s.gcr.io registry cannot be reached, so skip this step for now

[Recommended] Pull the container images required for a specific Kubernetes version

kubeadm config images pull --kubernetes-version=v1.22.2 --image-repository registry.aliyuncs.com/google_containers

If you use this method, add the following when running kubeadm init:

--image-repository registry.aliyuncs.com/google_containers
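For example, the init command used later in this guide would then look roughly like this (the IP address is the master address used in this guide; adjust it to your own):

kubeadm init --kubernetes-version=v1.22.2 \
  --pod-network-cidr=10.244.0.0/16 \
  --apiserver-advertise-address=192.168.60.134 \
  --image-repository registry.aliyuncs.com/google_containers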

Notes

All machines must have the images
Versions move with every release; if the versions you use do not meet the requirements, the init step will fail with a hint about the expected version
The kubeadm version and the image versions must match
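A quick way to check that they match; kubeadm config images list is a stock subcommand, and the tags it prints for v1.22.2 should be the same ones pulled above:

kubeadm version -o short                                  # e.g. v1.22.2
kubeadm config images list --kubernetes-version=v1.22.2
# compare the printed tags (kube-apiserver:v1.22.2, etcd:3.5.0-0, coredns:v1.8.4, pause:3.5, ...)
# with what `docker images` shows on every node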

Install Docker (on all nodes)

yum remove docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-selinux \
docker-engine-selinux \
docker-engine
yum install -y yum-utils device-mapper-persistent-data lvm2 git
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum install docker-ce -y
systemctl start docker && systemctl enable docker
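Optionally, you can also point Docker at a registry mirror to speed up image pulls; a sketch, assuming you have your own mirror address (the URL below is a placeholder, replace it with your actual mirror):

cat > /etc/docker/daemon.json <<EOF
{
  "registry-mirrors": ["https://<your-mirror>.mirror.aliyuncs.com"]
}
EOF
systemctl restart docker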

Download the images from the Aliyun registry (on all nodes)

[root@k8s-master ~]# cat pull.sh 
#!/usr/bin/bash
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.22.2
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.22.2
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.22.2
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.22.2
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.8.4
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.0-0
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.5

After the downloads finish, re-tag every image pulled from Aliyun with its Google (k8s.gcr.io) name

[root@k8s-master ~]# cat tag.sh 
#!/usr/bin/bash
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.22.2 k8s.gcr.io/kube-controller-manager:v1.22.2
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.22.2 k8s.gcr.io/kube-proxy:v1.22.2
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.22.2 k8s.gcr.io/kube-apiserver:v1.22.2
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.22.2 k8s.gcr.io/kube-scheduler:v1.22.2
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.8.4 k8s.gcr.io/coredns/coredns:v1.8.4
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.0-0 k8s.gcr.io/etcd:3.5.0-0
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.5 k8s.gcr.io/pause:3.5

Make the scripts executable

chmod +x pull.sh tag.sh

Run pull.sh first; once the images have finished pulling, run tag.sh
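To confirm the pull and re-tag worked, a quick check (the grep just filters for the k8s.gcr.io-named images):

./pull.sh && ./tag.sh
docker images | grep k8s.gcr.io    # should list kube-apiserver, kube-controller-manager, kube-scheduler, kube-proxy, etcd, coredns and pause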

Start the Installation

Environment Preparation (on all nodes)

I prepared three machines

192.168.60.134 k8s-master
192.168.60.135 k8s-node-1
192.168.60.136 k8s-node-2

Set up local name resolution: change the hostnames and make the hosts resolve each other

#Replace the IPs above with your own and write them in
vim /etc/hosts
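Concretely, something like the following, using the addresses above; run the hostnamectl line with the matching name on each machine, and append the same three entries to /etc/hosts on all of them:

hostnamectl set-hostname k8s-master   # use k8s-node-1 / k8s-node-2 on the other two machines
cat >> /etc/hosts <<EOF
192.168.60.134 k8s-master
192.168.60.135 k8s-node-1
192.168.60.136 k8s-node-2
EOF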

System configuration on all machines

1. Disable the firewall:
# systemctl stop firewalld
# systemctl disable firewalld
2. Disable SELinux for the current session:
# setenforce 0
3. Edit /etc/selinux/config and set SELINUX to disabled so it stays off after a reboot:
# sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
SELINUX=disabled
4. Synchronize the time:
# yum install -y ntpdate
# ntpdate ntp.aliyun.com

Disable the system swap (a requirement since Kubernetes 1.8)

Starting with Kubernetes 1.8, the system swap must be turned off; with the default configuration, kubelet will not start if swap is enabled. Option one: lift this restriction with the kubelet start parameter --fail-swap-on=false (a sketch is shown after the free -m output below). Option two: disable the system swap.

#Temporarily disable
[root@localhost /]# swapoff -a 
#Permanently disable: edit /etc/fstab and comment out the swap auto-mount, then confirm with free -m that swap is off
[root@localhost /]# sed -i 's/.*swap.*/#&/' /etc/fstab
#Check
[root@localhost /]# free -m
              total        used        free      shared  buff/cache   available
Mem:           3935         144        3415           8         375        3518
Swap:             0           0           0
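For completeness, a minimal sketch of option one, keeping swap enabled and telling kubelet and kubeadm to tolerate it; this reuses the /etc/sysconfig/kubelet file and the --ignore-preflight-errors=Swap flag that appear later in this guide, and is an alternative to disabling swap, not an extra step:

# /etc/sysconfig/kubelet
KUBELET_EXTRA_ARGS="--fail-swap-on=false"
# and pass the preflight override to kubeadm when initializing / joining
kubeadm init --ignore-preflight-errors=Swap ...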

Deploying Kubernetes with Kubeadm (on all nodes)

Configure the yum repository

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
  • Install the latest version (we do not use this)

    yum makecache fast
    
    yum install -y kubelet kubeadm kubectl ipvsadm
    
  1. Install the specific version (use this)

    Since I need to install k8s version 1.22.2

    yum install -y kubelet-1.22.2-0.x86_64 kubeadm-1.22.2-0.x86_64 kubectl-1.22.2-0.x86_64 ipvsadm
    
  2. Load the IPVS-related kernel modules

    These need to be reloaded after every reboot (you can put them in /etc/rc.local so they load automatically at boot; a systemd-native alternative is sketched after this list)

    # vim /etc/rc.local
    ==============================
    modprobe ip_vs
    modprobe ip_vs_rr
    modprobe ip_vs_wrr
    modprobe ip_vs_sh
    modprobe nf_conntrack_ipv4
    ==============================
    
  3. Make it executable

    chmod +x /etc/rc.local
    
  4. Reboot

    reboot
    
  5. Configuration:

    Configure the forwarding-related kernel parameters, otherwise errors may occur later

    cat <<EOF >  /etc/sysctl.d/k8s.conf
    net.bridge.bridge-nf-call-ip6tables = 1
    net.bridge.bridge-nf-call-iptables = 1
    vm.swappiness=0
    EOF
    
  6. Apply the configuration

    sysctl --system
    
  7. If net.bridge.bridge-nf-call-iptables reports an error, load the br_netfilter module

    modprobe br_netfilter
    
    sysctl -p /etc/sysctl.d/k8s.conf
    
  8. Check that the modules loaded successfully

    # lsmod | grep ip_vs
    ip_vs_sh               12688  0 
    ip_vs_wrr              12697  0 
    ip_vs_rr               12600  0 
    ip_vs                 145458  6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
    nf_conntrack          139264  7 ip_vs,nf_nat,nf_nat_ipv4,xt_conntrack,nf_nat_masquerade_ipv4,nf_conntrack_netlink,nf_conntrack_ipv4
    libcrc32c              12644  4 xfs,ip_vs,nf_nat,nf_conntrack
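    As mentioned in step 2, a more systemd-native alternative to /etc/rc.local for loading these modules at boot is a modules-load.d drop-in; a sketch, assuming the systemd-modules-load service that ships with CentOS 7:

    cat > /etc/modules-load.d/ipvs.conf <<EOF
    ip_vs
    ip_vs_rr
    ip_vs_wrr
    ip_vs_sh
    nf_conntrack_ipv4
    EOF
    systemctl restart systemd-modules-load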
    

Configure and Start the Kubelet (on all nodes)

  • Configure kubelet to use the pause image (not recommended) [just do step 1 below instead]

    Get Docker's cgroup driver

    DOCKER_CGROUPS=$(docker info | grep 'Cgroup' | cut -d' ' -f4)
    
    echo $DOCKER_CGROUPS
    
  1. Set the variable:

    DOCKER_CGROUPS=`docker info |grep 'Cgroup' | awk ' NR==1 {print $3}'`
    
    [root@k8s-master ~]# echo $DOCKER_CGROUPS
    	cgroupfs
    
  2. Configure the kubelet's cgroup driver

    • The domestic (China) pause image can cause errors, so we use the Google one

    • Domestic mirror (not recommended)

      cat >/etc/sysconfig/kubelet<<EOF
      KUBELET_EXTRA_ARGS="--cgroup-driver=$DOCKER_CGROUPS --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.1"
      EOF
      
    • Google image (recommended)

      cat >/etc/sysconfig/kubelet<<EOF
      KUBELET_EXTRA_ARGS="--cgroup-driver=$DOCKER_CGROUPS --pod-infra-container-image=k8s.gcr.io/pause:3.5"
      EOF
      
  3. Start it

    systemctl daemon-reload
    
    systemctl enable kubelet && systemctl restart kubelet
    
    • Note!!! (this is not actually a problem)

      If you run # systemctl status kubelet at this point, you will see error messages;

      Oct 11 00:26:43 node1 systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a
      Oct 11 00:26:43 node1 systemd[1]: Unit kubelet.service entered failed state.
      Oct 11 00:26:43 node1 systemd[1]: kubelet.service failed.

      Only after running # journalctl -xefu kubelet to look at the systemd logs do you find the real error:
          unable to load client CA file /etc/kubernetes/pki/ca.crt: open /etc/kubernetes/pki/ca.crt: no such file or directory
      #This error is resolved automatically once kubeadm init generates the CA certificates, so it can be ignored for now.
      #In short, kubelet will keep restarting until kubeadm init is run.
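      Before moving on, it can also help to confirm that the cgroup driver written into /etc/sysconfig/kubelet matches what Docker actually reports; a quick check (docker info's Go-template output is a stock Docker feature):

      docker info --format '{{.CgroupDriver}}'    # e.g. cgroupfs
      grep cgroup-driver /etc/sysconfig/kubelet   # the two values should agree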
      

Configure the Master Node

Run the initialization

[root@master]# kubeadm init --kubernetes-version=v1.22.2 --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.60.134 --ignore-preflight-errors=Swap

Note:

--apiserver-advertise-address=192.168.60.134    ---the master's IP address.
--kubernetes-version=v1.22.2   ---change this to your specific version
Also double-check that the swap partition is turned off

If it errors out with a version hint, that means a newer version has been released

[init] Using Kubernetes version: v1.22.2
[preflight] Running pre-flight checks
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 24.0.7. Latest validated version: 20.10
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.60.134]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master localhost] and IPs [192.168.60.134 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master localhost] and IPs [192.168.60.134 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 7.004532 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.22" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node k8s-master as control-plane by adding the labels: [node-role.kubernetes.io/master(deprecated) node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node k8s-master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: lcuyzk.xydd3lli4u74a0uc
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.60.134:6443 --token lcuyzk.xydd3lli4u74a0uc \
	--discovery-token-ca-cert-hash sha256:5dd1dabc5c8843adf70384a3775559c453b47b8b53af071a96eae11390f4ec8a

The output above records the whole initialization. From it you can see the key steps needed to manually initialize and install a Kubernetes cluster, including:

[kubelet] generates the kubelet configuration file "/var/lib/kubelet/config.yaml"
    [certificates] generates the various certificates
    [kubeconfig] generates the kubeconfig files
    [bootstraptoken] generates the bootstrap token; record it, as it is used later when adding nodes to the cluster with kubeadm join

The very end of the printed output contains this join token:

kubeadm join 192.168.60.134:6443 --token lcuyzk.xydd3lli4u74a0uc \
	--discovery-token-ca-cert-hash sha256:5dd1dabc5c8843adf70384a3775559c453b47b8b53af071a96eae11390f4ec8a

Remember to save it; it is different for every install and is only valid for 24 hours

Configure kubectl

On the master node:

(These steps are also shown in the init output above)

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Configure the Network Plugin

Operate on the master node

  • Download the config

    cd ~ && mkdir flannel && cd flannel
    
    curl -O https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
    

    Download the flannel manifest; the site is blocked in some regions, so you can try my accelerator mirror

    curl -O https://github.muou666.com/https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
    
  • Open the config file: vim kube-flannel.yml

    #Search for "image" and find the entry below; the version may differ, so go by what is in the manifest
    docker.io/flannel/flannel:v0.24.1
    #It is usually under containers
          containers:
          - name: kube-flannel
    👉       image: docker.io/flannel/flannel:v0.24.1
            command:
            - /opt/bin/flanneld
            args:
            - --ip-masq
            - --kube-subnet-mgr

    #Note: the flannel image in this kube-flannel.yml is docker.io/flannel/flannel:v0.24.1 and needs to be pulled in advance
    

    Find this image, then run the following on every machine

    docker pull docker.io/flannel/flannel:v0.24.1
    

    Configure the network interface parameter

    # If a node has multiple network interfaces, see https://github.com/kubernetes/kubernetes/issues/39701

    # You currently need to add the --iface parameter in kube-flannel.yml to specify the name of the host's internal-network interface, otherwise DNS may fail to resolve and containers may be unable to communicate.

    # kube-flannel.yml has to be downloaded locally for this,

    # then add --iface=<iface-name> to the flanneld start arguments
    
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.12.0-amd64
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
    =============================ignore this divider
        - --iface=ens33	#this is the interface to fill in; mine is ens33, yours may differ
    =============================ignore this divider
    ⚠️⚠️⚠️ the value of --iface=ens33 is your current network interface; you can also specify multiple interfaces
    
    # Since version 1.12, kubeadm sets an extra taint on nodes: node.kubernetes.io/not-ready:NoSchedule,

    # which simply means a node does not accept scheduling until it is ready. But a node will not become ready until the Kubernetes network plugin has been deployed.

    # So edit kube-flannel.yml and add a toleration for the node.kubernetes.io/not-ready:NoSchedule taint:

              - key: beta.kubernetes.io/arch
                operator: In
                values:
                - arm64
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
    =============================ignore this divider
      - key: node.kubernetes.io/not-ready  #add these three lines (around line 165 of the manifest)
        operator: Exists
        effect: NoSchedule
    =============================ignore this divider
      serviceAccountName: flannel
    
    • Apply it:
    # kubectl apply -f ~/flannel/kube-flannel.yml
    #After applying, it takes a little while to come up
    
    • Check:
    # kubectl get pods --namespace kube-system
    NAME                                 READY   STATUS    RESTARTS   AGE
    coredns-78fcd69978-4pwrd             1/1     Running   0          44m
    coredns-78fcd69978-cdcnh             1/1     Running   0          44m
    etcd-k8s-master                      1/1     Running   0          44m
    kube-apiserver-k8s-master            1/1     Running   0          44m
    kube-controller-manager-k8s-master   1/1     Running   0          44m
    kube-proxy-bwpsl                     1/1     Running   0          44m
    kube-scheduler-k8s-master            1/1     Running   0          44m
    # kubectl get service
    NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
    kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   45m
    # kubectl get svc --namespace kube-system
    NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
    kube-dns   ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   45m
    Nodes only show the Ready status after the network plugin has been installed and configured as well
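    One caveat: newer versions of kube-flannel.yml (including the v0.24.x manifest referenced above) deploy flannel into its own kube-flannel namespace rather than kube-system, so if the flannel pods do not show up above, check there:
    # kubectl get pods -n kube-flannel
    # kubectl get pods -A | grep flannel   # or search all namespaces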
    

Join the Worker (node) Nodes to the Cluster

Note: this is done on the worker nodes,

not including the master node

  • Run on all worker nodes; this command is the one returned after the master was successfully initialized

    i.e. the token you saved earlier, which is different for everyone

    kubeadm join 192.168.60.134:6443 --token lcuyzk.xydd3lli4u74a0uc \
    	--discovery-token-ca-cert-hash sha256:5dd1dabc5c8843adf70384a3775559c453b47b8b53af071a96eae11390f4ec8a
    
  • If it errors out, enable IP forwarding:

    sysctl -w net.ipv4.ip_forward=1
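    Note that sysctl -w only changes the value until the next reboot; to make it persistent you can reuse the /etc/sysctl.d/k8s.conf file created earlier:

    echo 'net.ipv4.ip_forward = 1' >> /etc/sysctl.d/k8s.conf
    sysctl --system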
    

On the Master

Various checks

  1. Check the pods

    [root@kub-k8s-master ~]# kubectl get pods -n kube-system
    
  2. Inspect an abnormal pod

    [root@kub-k8s-master ~]# kubectl  describe pods kube-flannel-ds-sr6tq -n  kube-system
    
  3. Check the nodes

    [root@k8s-master flannel]# kubectl get nodes
    NAME         STATUS   ROLES                  AGE     VERSION
    k8s-master   Ready    control-plane,master   56m     v1.22.2
    k8s-node-1   Ready    <none>                 8m12s   v1.22.2
    k8s-node-2   Ready    <none>                 7m49s   v1.22.2
    

At this point, the cluster setup is complete!!!!!!!!!!!!!!!

Troubleshooting

Some problems collected from around the web

Errors
Problem 1: inconsistent server times will cause errors
Check the server time on each machine
=====================================
Problem 2: kubeadm init does not succeed; the following message appears and then it times out with an error
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s

Checking the kubelet status showed the errors below: the host "master" could not be found and the image pull failed. The pause image was being pulled from aliyuncs, but I had already downloaded the official pause image, so following the name in the message I re-tagged my pause image with the Aliyun tag, then reset the kubeadm environment and re-initialized, and the error went away.
[root@master manifests]# systemctl  status kubelet -l
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since 四 2019-01-31 15:20:32 CST; 5min ago
     Docs: https://kubernetes.io/docs/
 Main PID: 23908 (kubelet)
    Tasks: 19
   Memory: 30.8M
   CGroup: /system.slice/kubelet.service
           └─23908 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=cgroupfs --network-plugin=cni --pod-infra-container-image=k8s.gcr.io/pause:3.1 --cgroup-driver=cgroupfs --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.1

1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.432357   23908 kubelet.go:2266] node "master" not found
1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.532928   23908 kubelet.go:2266] node "master" not found
1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.633192   23908 kubelet.go:2266] node "master" not found
1月 31 15:25:41 master kubelet[23908]: I0131 15:25:41.729296   23908 kubelet_node_status.go:278] Setting node annotation to enable volume controller attach/detach
1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.733396   23908 kubelet.go:2266] node "master" not found
1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.740110   23908 remote_runtime.go:96] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed pulling image "registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.1": Error response from daemon: Get https://registry.cn-hangzhou.aliyuncs.com/v2/: dial tcp 0.0.0.80:443: connect: invalid argument
1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.740153   23908 kuberuntime_sandbox.go:68] CreatePodSandbox for pod "kube-controller-manager-master_kube-system(e8f43404e60ae844e375d50b1e39d91e)" failed: rpc error: code = Unknown desc = failed pulling image "registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.1": Error response from daemon: Get https://registry.cn-hangzhou.aliyuncs.com/v2/: dial tcp 0.0.0.80:443: connect: invalid argument
1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.740166   23908 kuberuntime_manager.go:662] createPodSandbox for pod "kube-controller-manager-master_kube-system(e8f43404e60ae844e375d50b1e39d91e)" failed: rpc error: code = Unknown desc = failed pulling image "registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.1": Error response from daemon: Get https://registry.cn-hangzhou.aliyuncs.com/v2/: dial tcp 0.0.0.80:443: connect: invalid argument
1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.740207   23908 pod_workers.go:190] Error syncing pod e8f43404e60ae844e375d50b1e39d91e ("kube-controller-manager-master_kube-system(e8f43404e60ae844e375d50b1e39d91e)"), skipping: failed to "CreatePodSandbox" for "kube-controller-manager-master_kube-system(e8f43404e60ae844e375d50b1e39d91e)" with CreatePodSandboxError: "CreatePodSandbox for pod \"kube-controller-manager-master_kube-system(e8f43404e60ae844e375d50b1e39d91e)\" failed: rpc error: code = Unknown desc = failed pulling image \"registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.1\": Error response from daemon: Get https://registry.cn-hangzhou.aliyuncs.com/v2/: dial tcp 0.0.0.80:443: connect: invalid argument"
1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.833981   23908 kubelet.go:2266] node "master" not found

Solution:

Reset the kubeadm environment
Reset/remove every node in the cluster (including the master)
1. Drain the pods on the k8s-node-1 node (on the master)
[root@kub-k8s-master ~]# kubectl drain kub-k8s-node1 --delete-local-data --force --ignore-daemonsets

2. Delete the node (on the master)
[root@kub-k8s-master ~]# kubectl delete node kub-k8s-node1

3. Reset the node (on the node, i.e. on the node that was just removed)
[root@kub-k8s-node1 ~]# kubeadm reset

Note 1: the master must also be drained, deleted and reset. This is what tripped me up: the first time I did not drain and delete the master, and in the end everything looked normal but coredns simply would not work. It cost me a full day, so do not try it.

Note 2: after reset, the following files need to be deleted on the master
# rm -rf /var/lib/cni/ $HOME/.kube/config

###Note: if the whole k8s cluster was already built, follow all of the steps above to reset it. If the failure happened during initialization, only step 3 is needed
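A fuller cleanup sketch, based on the hints that kubeadm reset itself prints (iptables/IPVS rules and CNI configuration are not cleaned up automatically; adjust to your own setup):

kubeadm reset -f
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
ipvsadm --clear
rm -rf /etc/cni/net.d /var/lib/cni/ $HOME/.kube/config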

Regenerate the Token

Adding nodes to the cluster after the kubeadm-generated token has expired

After initializing with kubeadm, a token for nodes to join is always provided:
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join 192.168.246.166:6443 --token n38l80.y2icehgzsyuzkthi \
    --discovery-token-ca-cert-hash sha256:5fb6576ef82b5655dee285e0c93432aee54d38779bc8488c32f5cbbb90874bac
By default the token is valid for 24 hours; once it has expired, that token can no longer be used.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Solution:
1. Generate a new token:
[root@kub-k8s-master]# kubeadm  token create
kiyfhw.xiacqbch8o8fa8qj
[root@kub-k8s-master]# kubeadm  token list
TOKEN                     TTL         EXPIRES                     USAGES                   DESCRIPTION   EXTRA GROUPS
gvvqwk.hn56nlsgsv11mik6   <invalid>   2018-10-25T14:16:06+08:00   authentication,signing   <none>        system:bootstrappers:kubeadm:default-node-token
kiyfhw.xiacqbch8o8fa8qj   23h         2018-10-27T06:39:24+08:00   authentication,signing   <none>        system:bootstrappers:kubeadm:default-node-token

2. Get the sha256 hash of the CA certificate:
[root@kub-k8s-master]# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
5417eb1b68bd4e7a4c82aded83abc55ec91bd601e45734d6aba85de8b1ebb057

3. Join the node to the cluster:
  kubeadm join 18.16.202.35:6443 --token kiyfhw.xiacqbch8o8fa8qj --discovery-token-ca-cert-hash sha256:5417eb1b68bd4e7a4c82aded83abc55ec91bd601e45734d6aba85de8b1ebb057
After a few seconds you should see this node in the output of kubectl get nodes run on the master.

The method above is fairly tedious; this does it in one step:
[root@kub-k8s-master ~]# kubeadm token create --print-join-command

A second method:
[root@kub-k8s-master ~]# token=$(kubeadm token generate)
kubeadm token create $token --print-join-command --ttl=0

Then run on the node:
[root@kub-k8s-node1 ~]# kubeadm reset
[root@kub-k8s-node1 ~]# run the join command generated above