kubernetes部署(containerd版)

  • A+
所属分类:k8s 总复习

使用 kubeadm 从头搭建一个使用 containerd 作为容器运行时的 Kubernetes 集群,这里我们安装 v1.22.1 版本。

环境准备

kubernetes部署(containerd版)
kubernetes部署(containerd版)

节点的 hostname 必须使用标准的 DNS 命名,另外千万不用什么默认的 localhost 的 hostname,会导致各种错误出现的。在 Kubernetes 项目里,机器的名字以及一切存储在 Etcd 中的 API 对象,都必须使用标准的 DNS 命名(RFC 1123)。可以使用命令 hostnamectl set-hostname node1 来修改 hostname。

每个节点都做:

[root@master1 ~]# systemctl disable firewalld --now
Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
[root@master1 ~]# setenforce 0
[root@master1 ~]# sed -ir "/^SELINUX=/c SELINUX=disabled" /etc/selinux/config
[root@master1 ~]# modprobe br_netfilter

[root@master1 ~]# vim /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1

[root@master1 ~]# sysctl -p /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1

[root@master1 ~]# cat > /etc/sysconfig/modules/ipvs.modules <<EOF
> #!/bin/bash
> modprobe -- ip_vs
> modprobe -- ip_vs_rr
> modprobe -- ip_vs_wrr
> modprobe -- ip_vs_sh
> modprobe -- nf_conntrack_ipv4
> EOF


[root@master1 ~]# chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4
nf_conntrack_ipv4      15053  0
nf_defrag_ipv4         12729  1 nf_conntrack_ipv4
ip_vs_sh               12688  0
ip_vs_wrr              12697  0
ip_vs_rr               12600  0
ip_vs                 145458  6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_conntrack          139264  2 ip_vs,nf_conntrack_ipv4
libcrc32c              12644  3 xfs,ip_vs,nf_conntrack

[root@master1 ~]# yum install -y ipset ipvsadm
[root@master1 ~]# yum install -y chrony
[root@master1 ~]# systemctl enable chronyd --now
[root@master1 ~]# chronyc sources
210 Number of sources = 4
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^- a.chl.la                      2   8   335    99    +37ms[  +37ms] +/-  163ms
^* 139.199.214.202               2   9   377   487  +1531us[+1924us] +/-   34ms
^+ time.neu.edu.cn               1   9   377   373   +463us[ +463us] +/-   35ms
^+ time.neu.edu.cn               1   8   377   102   -710us[ -710us] +/-   36ms

[root@master1 ~]# swapoff -a
[root@master1 ~]# sed -ir '/ swap / s/^\(.*\)$/#\1/' /etc/fstab

[root@master1 ~]# cat /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
vm.swappiness=0

[root@master1 ~]# sysctl -p /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
vm.swappiness = 0

bridge-nf 使得 netfilter 可以对 Linux 网桥上的 IPv4/ARP/IPv6 包过滤。比如,设置net.bridge.bridge-nf-call-iptables=1后,二层的网桥在转发包时也会被 iptables的 FORWARD 规则所过滤。常用的选项包括:
net.bridge.bridge-nf-call-arptables:是否在 arptables 的 FORWARD 中过滤网桥的 ARP 包
net.bridge.bridge-nf-call-ip6tables:是否在 ip6tables 链中过滤 IPv6 包
net.bridge.bridge-nf-call-iptables:是否在 iptables 链中过滤 IPv4 包
net.bridge.bridge-nf-filter-vlan-tagged:是否在 iptables/arptables 中过滤打了 vlan 标签的包。

安装 Containerd

[root@master1 ~]# rpm -qa | grep libseccomp
libseccomp-2.3.1-4.el7.x86_64
[root@master1 ~]# rpm -e libseccomp-2.3.1-4.el7.x86_64 --nodeps
[root@master1 ~]# wget http://rpmfind.net/linux/centos/8-stream/BaseOS/x86_64/os/Packages/libseccomp-2.5.1-1.el8.x86_64.rpm
[root@master1 ~]# rpm -ivh libseccomp-2.5.1-1.el8.x86_64.rpm

[root@master1 ~]# wget https://github.com/containerd/containerd/releases/download/v1.5.10/cri-containerd-cni-1.5.10-linux-amd64.tar.gz
# 如果有限制,也可以替换成下面的 URL 加速下载
# wget https://download.fastgit.org/containerd/containerd/releases/download/v1.5.10/cri-containerd-cni-1.5.10-linux-amd64.tar.gz

[root@master1 ~]# tar -C / -xzf cri-containerd-cni-1.5.10-linux-amd64.tar.gz

然后要将 /usr/local/bin 和 /usr/local/sbin 追加到 ~/.bashrc 文件的 PATH 环境变量中:
export PATH=$PATH:/usr/local/bin:/usr/local/sbin
[root@master1 ~]#source ~/.bashrc

[root@master1 ~]# mkdir -p /etc/containerd
[root@master1 ~]# containerd config default > /etc/containerd/config.toml

对于使用 systemd 作为 init system 的 Linux 的发行版,使用 systemd 作为容器的 cgroup driver 可以确保节点在资源紧张的情况更加稳定,所以推荐将 containerd 的 cgroup driver 配置为 systemd。

修改前面生成的配置文件 /etc/containerd/config.toml,在 plugins.“io.containerd.grpc.v1.cri”.containerd.runtimes.runc.options 配置块下面将 SystemdCgroup 设置为 true:

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  ...
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true
    ....

然后再为镜像仓库配置一个加速器,需要在 cri 配置块下面的 registry 配置块下面进行配置 registry.mirrors:

[plugins."io.containerd.grpc.v1.cri"]
  ...
  # sandbox_image = "k8s.gcr.io/pause:3.5"
  sandbox_image = "registry.aliyuncs.com/k8sxio/pause:3.5"
  ...
  [plugins."io.containerd.grpc.v1.cri".registry]
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors]
      [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
        endpoint = ["https://xxxxxxxxxx.mirror.aliyuncs.com"]
      [plugins."io.containerd.grpc.v1.cri".registry.mirrors."k8s.gcr.io"]
        endpoint = ["https://registry.aliyuncs.com/k8sxio"]

由于上面我们下载的 containerd 压缩包中包含一个 etc/systemd/system/containerd.service 的文件,这样我们就可以通过 systemd 来配置 containerd 作为守护进程运行了,现在我们就可以启动 containerd 了,直接执行下面的命令即可:

[root@master1 ~]# systemctl daemon-reload
[root@master1 ~]# systemctl enable containerd --now
Created symlink from /etc/systemd/system/multi-user.target.wants/containerd.service to /etc/systemd/system/containerd.service.

启动完成后就可以使用 containerd 的本地 CLI 工具 ctr 和 crictl 了,比如查看版本:

[root@master1 ~]# ctr version
Client:
  Version:  v1.5.10
  Revision: 2a1d4dbdb2a1030dc5b01e96fb110a9d9f150ecc
  Go version: go1.16.14

Server:
  Version:  v1.5.10
  Revision: 2a1d4dbdb2a1030dc5b01e96fb110a9d9f150ecc
  UUID: dca19e1e-fbef-4b03-ba22-a51bdaab8740
[root@master1 ~]# crictl version
Version:  0.1.0
RuntimeName:  containerd
RuntimeVersion:  v1.5.10
RuntimeApiVersion:  v1alpha2

使用 kubeadm 部署 Kubernetes

上面的相关环境配置也完成了,现在我们就可以来安装 Kubeadm 了,我们这里是通过指定yum 源的方式来进行安装的:

➜  ~ cat <<EOF > /etc/yum.repos.d/kubernetes.repo [kubernetes] name=Kubernetes baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64 enabled=1 gpgcheck=1 repo_gpgcheck=1 gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg EOF

上面的 yum 源是需要科学上网的,如果不能科学上网的话,我们可以使用阿里云的源进行安装:

cat <<EOF > /etc/yum.repos.d/kubernetes.repo [kubernetes] name=Kubernetes baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64 enabled=1 gpgcheck=0 repo_gpgcheck=0 gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg EOF

然后安装 kubeadm、kubelet、kubectl:

# --disableexcludes 禁掉除了kubernetes之外的别的仓库
yum makecache fast
yum install -y kubelet-1.22.1 kubeadm-1.22.1 kubectl-1.22.1 --disableexcludes=kubernetes
kubeadm version
kubeadm version: &version.Info{
   Major:"1", Minor:"22", GitVersion:"v1.22.1", GitCommit:"632ed300f2c34f6d6d15ca4cef3d3c7073412212", GitTreeState:"clean", BuildDate:"2021-08-19T15:44:22Z", GoVersion:"go1.16.7", Compiler:"gc", Platform:"linux/amd64"}

systemctl enable --now kubelet

接下来的操作是master节点进行的操作
初始化集群¶
当我们执行 kubelet --help 命令的时候可以看到原来大部分命令行参数都被 DEPRECATED了,这是因为官方推荐我们使用 --config 来指定配置文件,在配置文件中指定原来这些参数的配置,可以通过官方文档 Set Kubelet parameters via a config file 了解更多相关信息,这样 Kubernetes 就可以支持动态 Kubelet 配置(Dynamic Kubelet Configuration)了,参考 Reconfigure a Node’s Kubelet in a Live Cluster。

然后我们可以通过下面的命令在 master 节点上输出集群初始化默认使用的配置:

kubeadm config print init-defaults --component-configs KubeletConfiguration > kubeadm.yaml

然后根据我们自己的需求修改配置,比如修改 imageRepository 指定集群初始化时拉取 Kubernetes 所需镜像的地址,kube-proxy 的模式为 ipvs,另外需要注意的是我们这里是准备安装 flannel 网络插件的,需要将 networking.podSubnet 设置为10.244.0.0/16:

# kubeadm.yaml
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.23.205  # 指定master节点内网IP
  bindPort: 6443
nodeRegistration:
  criSocket: /run/containerd/containerd.sock  # 使用 containerd的Unix socket 地址
  imagePullPolicy: IfNotPresent
  name: master1   #master节点的主机名
  taints:  # 给master添加污点,master节点不能调度应用
  - effect: "NoSchedule"
    key: "node-role.kubernetes.io/master"

---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs  # kube-proxy 模式

---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {
   }
dns: {
   }
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/k8sxio
kind: ClusterConfiguration
kubernetesVersion: 1.22.1
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16  # 指定 pod 子网
scheduler: {
   }

---
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 0s
    cacheUnauthorizedTTL: 0s
clusterDNS:
- 10.96.0.10
clusterDomain: cluster.local
cpuManagerReconcilePeriod: 0s
evictionPressureTransitionPeriod: 0s
fileCheckFrequency: 0s
healthzBindAddress: 127.0.0.1
healthzPort: 10248
httpCheckFrequency: 0s
imageMinimumGCAge: 0s
kind: KubeletConfiguration
cgroupDriver: systemd  # 配置 cgroup driver
logging: {
   }
memorySwap: {
   }
nodeStatusReportFrequency: 0s
nodeStatusUpdateFrequency: 0s
rotateCertificates: true
runtimeRequestTimeout: 0s
shutdownGracePeriod: 0s
shutdownGracePeriodCriticalPods: 0s
staticPodPath: /etc/kubernetes/manifests
streamingConnectionIdleTimeout: 0s
syncFrequency: 0s
volumeStatsAggPeriod: 0s

在开始初始化集群之前可以使用kubeadm config images pull --config kubeadm.yaml预先在各个服务器节点上拉取所k8s需要的容器镜像。

配置文件准备好过后,可以使用如下命令先将相关镜像 pull 下面:

[root@master1 ~]# kubeadm config images pull --config kubeadm.yaml
[config/images] Pulled registry.aliyuncs.com/k8sxio/kube-apiserver:v1.22.1
[config/images] Pulled registry.aliyuncs.com/k8sxio/kube-controller-manager:v1.22.1
[config/images] Pulled registry.aliyuncs.com/k8sxio/kube-scheduler:v1.22.1
[config/images] Pulled registry.aliyuncs.com/k8sxio/kube-proxy:v1.22.1
[config/images] Pulled registry.aliyuncs.com/k8sxio/pause:3.5
[config/images] Pulled registry.aliyuncs.com/k8sxio/etcd:3.5.0-0
failed to pull image "registry.aliyuncs.com/k8sxio/coredns:v1.8.4": output: time="2022-05-27T01:21:15-04:00" level=found desc = failed to pull and unpack image \"registry.aliyuncs.com/k8sxio/coredns:v1.8.4\": failed to resolve refer8.4\": registry.aliyuncs.com/k8sxio/coredns:v1.8.4: not found"
, error: exit status 1
To see the stack trace of this error execute with --v=5 or higher

拉取 coredns 镜像的时候出错了(其实就是用加速器名称就不对应了),没有找到这个镜像,我们可以手动 pull 该镜像,然后重新 tag 下镜像地址即可:

[root@master1 ~]# ctr -n k8s.io i pull docker.io/coredns/coredns:1.8.4
docker.io/coredns/coredns:1.8.4:                                                  resolved       |++++++++++++++++++++++++++++++++++++++|
index-sha256:6e5a02c21641597998b4be7cb5eb1e7b02c0d8d23cce4dd09f4682d463798890:    done           |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:10683d82b024a58cc248c468c2632f9d1b260500f7cd9bb8e73f751048d7d6d4: done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:bc38a22c706b427217bcbd1a7ac7c8873e75efdd0e59d6b9f069b4b243db4b4b:    done           |++++++++++++++++++++++++++++++++++++++|
config-sha256:8d147537fb7d1ac8895da4d55a5e53621949981e2e6460976dae812f83d84a44:   done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:c6568d217a0023041ef9f729e8836b19f863bcdb612bb3a329ebc165539f5a80:    done           |++++++++++++++++++++++++++++++++++++++|
elapsed: 17.3s                                                                    total:  13.1 M (773.7 KiB/s)      
unpacking linux/amd64 sha256:6e5a02c21641597998b4be7cb5eb1e7b02c0d8d23cce4dd09f4682d463798890...
done: 974.228881ms

[root@master1 ~]# ctr -n k8s.io i tag docker.io/coredns/coredns:1.8.4 registry.aliyuncs.com/k8sxio/coredns:v1.8.4
registry.aliyuncs.com/k8sxio/coredns:v1.8.4

然后就可以使用上面的配置文件在 master 节点上进行初始化:

[root@master1 ~]# kubeadm init --config kubeadm.yaml
[init] Using Kubernetes version: v1.22.1
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local master1] and IPs [10.96.0.1 192.168.23.205]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost master1] and IPs [192.168.23.205 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost master1] and IPs [192.168.23.205 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 21.011508 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.22" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node master1 as control-plane by adding the labels: [node-role.kubernetes.io/master(deprecated) node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node master1 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.23.205:6443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:e35247f442d800b821d8dcaf041c6095f528824b2a5b3a4115ce6598b36f7d09

根据安装提示拷贝 kubeconfig 文件:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

然后可以使用 kubectl 命令查看 master 节点已经初始化成功了:

[root@master1 ~]# kubectl get node
NAME      STATUS   ROLES                  AGE    VERSION
master1   Ready    control-plane,master   6m2s   v1.22.1

添加节点

记住初始化集群上面的配置和操作要提前做好,将 master 节点上面的 $HOME/.kube/config 文件拷贝到 node 节点对应的文件中,安装 kubeadm、kubelet、kubectl(可选),然后执行上面初始化完成后提示的 join 命令即可:

[root@node1 containerd]# kubeadm join 192.168.23.205:6443 --token abcdef.0123456789abcdef \
>         --discovery-token-ca-cert-hash sha256:e35247f442d800b821d8dcaf041c6095f528824b2a5b3a4115ce6598b36f7d09
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

如果忘记了上面的 join 命令可以使用命令 kubeadm token create --print-join-command 重新获取。

安装网络插件,接下来安装网络插件,可以在文档 https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/ 中选择我们自己的网络插件,这里我们安装 flannel:

➜  ~ wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# 如果有节点是多网卡,则需要在资源清单文件中指定内网网卡
# 搜索到名为 kube-flannel-ds 的 DaemonSet,在kube-flannel容器下面
➜  ~ vi kube-flannel.yml
......
containers:
- name: kube-flannel
  image: quay.io/coreos/flannel:v0.14.0   #相关的image换成这个,cni-plugin那个的插件可以不用换,rancher的镜像在这里使用有点问题,如果rancher的cni-plugin也有问题,可以到github或docker上找替代的镜像
  command:
  - /opt/bin/flanneld
  args:
  - --ip-masq
  - --kube-subnet-mgr
  - --iface=ens33  # 如果是多网卡的话,指定内网网卡的名称
......
➜  ~ kubectl apply -f kube-flannel.yml  # 安装 flannel 网络插件

如果有多网卡网络,flannel最好是指定通信网卡,越精确越好,否则不指定它则使用默认路由的网卡通信。
(可以指定多张网卡名称,比如新节点不是ens33,还有就是进去该节点的pod容器中手动修改网卡名称)

当我们部署完网络插件后执行 ifconfig 命令,正常会看到新增的cni0与flannel1这两个虚拟设备,但是如果没有看到cni0这个设备也不用太担心,我们可以观察/var/lib/cni目录是否存在,如果不存在并不是说部署有问题,而是该节点上暂时还没有应用运行,我们只需要在该节点上运行一个 Pod 就可以看到该目录会被创建,并且cni0设备也会被创建出来。

[root@master1 ~]# kubectl get pod -n kube-system
NAME                              READY   STATUS     RESTARTS   AGE
coredns-7568f67dbd-dd2dp          1/1     Running    0          15m
coredns-7568f67dbd-rzfs7          1/1     Running    0          15m
etcd-master1                      1/1     Running    0          15m
kube-apiserver-master1            1/1     Running    0          15m
kube-controller-manager-master1   1/1     Running    0          15m
kube-flannel-ds-259dw             0/1     Init:1/2   0          2m55s
kube-flannel-ds-bmctc             0/1     Init:1/2   0          2m55s
kube-flannel-ds-r4nvw             0/1     Init:1/2   0          2m55s
kube-proxy-cj7zq                  1/1     Running    0          8m6s
kube-proxy-cv9tx                  1/1     Running    0          8m14s
kube-proxy-kn6dk                  1/1     Running    0          15m
kube-scheduler-master1            1/1     Running    0          15m


Dashboard

v1.22.1 版本的集群需要安装最新的 2.0+ 版本的 Dashboard:

# 推荐使用下面这种方式
➜  ~ wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.3.1/aio/deploy/recommended.yaml
➜  ~ vi recommended.yaml
# 修改Service为NodePort类型
......
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  ports:
    - port: 443
      targetPort: 8443
  selector:
    k8s-app: kubernetes-dashboard
  type: NodePort  # 加上type=NodePort变成NodePort类型的服务
......

在 YAML 文件中可以看到新版本 Dashboard 集成了一个 metrics-scraper 的组件,可以通过 Kubernetes 的 Metrics API 收集一些基础资源的监控信息,并在 web 页面上展示,所以要想在页面上展示监控信息就需要提供 Metrics API,比如安装 Metrics Server。

kubectl apply -f recommended.yaml

kubectl get pods -n kubernetes-dashboard -o wide
NAME                                         READY   STATUS    RESTARTS   AGE   IP          NODE     NOMINATED NODE   READINESS GATES
dashboard-metrics-scraper-856586f554-pllvt   1/1     Running   0          24m   10.88.0.9   master1   <none>           <none>
kubernetes-dashboard-76597d7        1/1     Running   0          21m   10.88.0.2   node2    <none>           <none>

上面的 Pod 分配的 IP 段是 10.88.xx.xx,包括前面自动安装的 CoreDNS 也是如此,我们前面配置的 podSubnet 为 10.244.0.0/16

ls -la /etc/cni/net.d/
total 8
drwxr-xr-x  2 1001 docker  67 Aug 31 16:45 .
drwxr-xr-x. 3 1001 docker  19 Jul 30 01:13 ..
-rw-r--r--  1 1001 docker 604 Jul 30 01:13 10-containerd-net.conflist
-rw-r--r--  1 root root   292 Aug 31 16:45 10-flannel.conflist

一个是 10-containerd-net.conflist,另外一个是我们上面创建的 Flannel 网络插件生成的配置,我们的需求肯定是想使用 Flannel 的这个配置,我们可以查看下 containerd 这个自带的 cni 插件配置:

➜  ~ cat /etc/cni/net.d/10-containerd-net.conflist
{
   
  "cniVersion": "0.4.0",
  "name": "containerd-net",
  "plugins": [
    {
   
      "type": "bridge",
      "bridge": "cni0",
      "isGateway": true,
      "ipMasq": true,
      "promiscMode": true,
      "ipam": {
   
        "type": "host-local",
        "ranges": [
          [{
   
            "subnet": "10.88.0.0/16"
          }],
          [{
   
            "subnet": "2001:4860:4860::/64"
          }]
        ],
        "routes": [
          {
    "dst": "0.0.0.0/0" },
          {
    "dst": "::/0" }
        ]
      }
    },
    {
   
      "type": "portmap",
      "capabilities": {
   "portMappings": true}
    }
  ]
}

可以看到上面的 IP 段恰好就是 10.88.0.0/16,但是这个 cni 插件类型是 bridge 网络,网桥的名称为 cni0:

ip a
...
6: cni0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 9a:e7:eb:40:e8:66 brd ff:ff:ff:ff:ff:ff
    inet 10.88.0.1/16 brd 10.88.255.255 scope global cni0
       valid_lft forever preferred_lft forever
    inet6 2001:4860:4860::1/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::98e7:ebff:fe40:e866/64 scope link
       valid_lft forever preferred_lft forever
...

但是使用 bridge 网络的容器无法跨多个宿主机进行通信,跨主机通信需要借助其他的 cni 插件,比如上面我们安装的 Flannel,或者 Calico 等等,由于我们这里有两个 cni 配置,所以我们需要将 10-containerd-net.conflist 这个配置删除,因为如果这个目录中有多个 cni 配置文件,kubelet 将会使用按文件名的字典顺序排列的第一个作为配置文件,所以前面默认选择使用的是 containerd-net 这个插件。

mv /etc/cni/net.d/10-containerd-net.conflist /etc/cni/net.d/10-containerd-net.conflist.bak
ifconfig cni0 down && ip link delete cni0
systemctl daemon-reload
systemctl restart containerd kubelet

重建 coredns 和 dashboard 的 Pod,重建后 Pod 的 IP 地址就正常了,变回我们设置的10.244.0.0网段

➜  ~ kubectl get svc -n kubernetes-dashboard
NAME                        TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE
dashboard-metrics-scraper   ClusterIP   10.99.37.172    <none>        8000/TCP        25m
kubernetes-dashboard        NodePort    10.103.102.27   <none>        443:31050/TCP   25m

然后可以通过上面的 31050 端口去访问 Dashboard,要记住使用 https,Chrome 不生效可以使用Firefox 测试,点击页面中的信任证书即可:

然后创建一个具有全局所有权限的用户来登录 Dashboard:(admin.yaml)

直接使用cluster-admin

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: admin
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: admin
  namespace: kubernetes-dashboard
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin
  namespace: kubernetes-dashboard

kubectl apply -f admin.yaml
kubectl get secret -n kubernetes-dashboard|grep admin-token
admin-token-lwmmx                  kubernetes.io/service-account-token   3         1d
kubectl get secret admin-token-lwmmx -o jsonpath={
   .data.token} -n kubernetes-dashboard |base64 -d
# 会生成一串很长的base64后的字符串

然后用上面的 base64 解码后的字符串作为 token 登录 Dashboard 即可
kubernetes部署(containerd版)

清理

如果你的集群安装过程中遇到了其他问题,我们可以使用下面的命令来进行重置:

kubeadm reset
ifconfig cni0 down && ip link delete cni0
ifconfig flannel.1 down && ip link delete flannel.1
rm -rf /var/lib/cni/
w3cjava