Kylin Linux V11: deploying a Kubernetes v1.35.0 high-availability cluster with kubeadm
System environment
- # nkvers
- ############## Kylin Linux Version #################
- Release:
- Kylin Linux Advanced Server release V11 (Swan25)
- Kernel:
- 6.6.0-32.7.v2505.ky11.x86_64
- Build:
- Kylin Linux Advanced Server
- release V11 2503/(Swan25)-x86_64-Build20/20250715
- #################################################
| Hostname | IP address | Spec | Services |
| --- | --- | --- | --- |
| demo-master-01 | 192.168.122.171 | 2c4g | k8s master, etcd, keepalived, haproxy, containerd |
| demo-master-02 | 192.168.122.172 | 2c4g | k8s master, etcd, keepalived, haproxy, containerd |
| demo-worker-01 | 192.168.122.173 | 2c4g | k8s worker, etcd |

Service versions
| Service | Version |
| --- | --- |
| kubernetes | 1.35.0 |
| containerd | 2.2.1 |
| etcd | 3.6.7 |
| cni-plugins | 1.3.0 |
| runc | 1.4.0 |
| keepalived | 2.3.4 |
| haproxy | 3.2.10 |
| cilium | 1.18.6 |

Preface
This document demonstrates deploying a highly available Kubernetes cluster with kubeadm, installing every component from binary releases.
It is a walkthrough of the procedure only; you can later automate the configuration with Ansible or with Shell/Go scripts to make deployment more efficient.
v1.35.0 differs from earlier releases in several ways and brings new features, and paired with the newly released domestic server OS Kylin Linux V11, there should be something new to take away here!
Environment preparation
This stage configures the prerequisites for deploying k8s; run these steps on every host.
Configure hostnames and hosts resolution
- # Set the hostname (run the matching command on each node)
- hostnamectl set-hostname demo-master-01
- hostnamectl set-hostname demo-master-02
- hostnamectl set-hostname demo-worker-01
- # /etc/hosts
- 192.168.122.171 demo-master-01
- 192.168.122.172 demo-master-02
- 192.168.122.173 demo-worker-01
Disable the firewall, SELinux, and swap
- # Firewall
- systemctl stop firewalld.service
- systemctl disable firewalld.service
- # SELinux is disabled by default; if it is enabled, run:
- setenforce 0
- sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
- # Swap
- swapoff -a
- sed -ri 's/.*swap.*/#&/' /etc/fstab
Configure time synchronization
- # Set the time zone
- timedatectl set-timezone Asia/Shanghai
- # Configure the NTP server
- vim /etc/chrony.conf
- ...
- server ntp1.aliyun.com iburst
- ...
- # Start the service and enable it at boot
- systemctl enable --now chronyd.service
- # Check the service and the sync status
- systemctl status chronyd.service
- timedatectl
- chronyc sources -v
Kernel configuration
Kylin V11 already ships a 6.6 kernel, so no major kernel upgrade is needed; only some kernel parameters have to be configured.
Because the CNI plugin will be Cilium, which replaces kube-proxy outright, the ipvs modules would not normally be required; however, since keepalived runs on the same nodes as Kubernetes, the ipvs modules are loaded anyway.
For the order in which Kylin applies sysctl kernel parameters, see the earlier article: thoughts on the sysctl parameter load order on Kylin v10.
Kubernetes 1.35 defaults to cgroup v2, and upstream strongly recommends it. Using cgroup v2 has the following requirements:
- The operating system distribution has cgroup v2 enabled
- The Linux kernel is version 5.8 or newer
- The container runtime supports cgroup v2, for example containerd v1.4+ or cri-o v1.20+
- kubelet and the container runtime are configured to use the systemd cgroup driver
Check the current default cgroup version:
- stat -fc %T /sys/fs/cgroup/
- tmpfs
tmpfs means cgroup v1; cgroup2fs means cgroup v2.
Modify the kernel cmdline and reboot so that cgroup v2 becomes the default:
- vim /etc/default/grub
- # Append the systemd.unified_cgroup_hierarchy=1 parameter
- GRUB_CMDLINE_LINUX="... systemd.unified_cgroup_hierarchy=1"
Regenerate the GRUB configuration file and reboot the server:
- grub2-mkconfig -o /boot/grub2/grub.cfg
- reboot
Verify the cgroup version again:
- stat -fc %T /sys/fs/cgroup/
- cgroup2fs
Configure the kernel modules to load at boot:
- cat >/etc/modules-load.d/k8s.conf << EOF
- overlay
- br_netfilter
- ip_vs
- ip_vs_rr
- ip_vs_wrr
- ip_vs_sh
- nf_conntrack
- ip_tables
- ip_set
- xt_set
- ipt_set
- ipt_rpfilter
- ipt_REJECT
- ipip
- EOF
- # Enable the module-load service at boot and reload it to apply
- systemctl enable --now systemd-modules-load
- systemctl restart systemd-modules-load
- # Load the modules immediately
- modprobe overlay
- modprobe br_netfilter
Configure the sysctl kernel parameters:
- # Remove conflicting old settings
- sed -i '/net.ipv4.ip_forward/d' /etc/sysctl.conf /etc/sysctl.d/99-sysctl.conf
- # Add the new configuration
- cat > /etc/sysctl.d/99-kubernetes-cri.conf << EOF
- # ========== Networking (core for the K8s CNI plugin / container traffic) ==========
- # Make bridged traffic traverse iptables rules (required by CNI plugins)
- net.bridge.bridge-nf-call-iptables = 1
- net.bridge.bridge-nf-call-ip6tables = 1
- # Enable IP forwarding (required for cross-node Pod traffic and Service forwarding)
- net.ipv4.ip_forward = 1
- # Reuse TIME_WAIT sockets to improve performance under high concurrency
- net.ipv4.tcp_tw_reuse = 1
- # Raise the socket listen backlog (avoid refused connections under load)
- net.core.somaxconn = 65535
- # Raise the network device receive queue (better throughput)
- net.core.netdev_max_backlog = 65535
- # Raise the TCP SYN backlog (resist SYN floods, faster connection setup)
- net.ipv4.tcp_max_syn_backlog = 65535
- # Shorten the FIN_WAIT2 timeout (release resources faster)
- net.ipv4.tcp_fin_timeout = 30
- # Tune the ARP cache (less cache eviction in large clusters)
- net.ipv4.neigh.default.gc_thresh1 = 8192
- net.ipv4.neigh.default.gc_thresh2 = 32768
- net.ipv4.neigh.default.gc_thresh3 = 65536
- # ========== containerd / container runtime ==========
- # Raise inotify instance/watch limits (containerd watches many container files)
- fs.inotify.max_user_instances = 1048576
- fs.inotify.max_user_watches = 1048576
- # Raise the system-wide file handle limit (each container holds many handles)
- fs.file-max = 52706963
- # Raise the per-process open file limit (pairs with file-max)
- fs.nr_open = 52706963
- # Raise the number of memory map areas (needed by containerized apps such as ES)
- vm.max_map_count = 262144
- # Avoid swapping (K8s requirement; swap hurts container performance)
- vm.swappiness = 0
- # Raise the maximum PID count (large Pod counts mean many processes)
- kernel.pid_max = 4194304
- # ========== K8s core stability ==========
- # Reboot 10 seconds after a kernel panic (better cluster self-healing)
- kernel.panic = 10
- # Panic on a kernel oops (avoid a half-dead node)
- kernel.panic_on_oops = 1
- # Panic on a soft lockup (avoid an unresponsive node)
- kernel.softlockup_panic = 1
- EOF
- # Apply the new configuration
- sysctl --system
- # Check that it took effect
- sysctl -a | grep -E 'ip_forward|nr_open'
Deploy keepalived and HAProxy
Install keepalived and HAProxy on both master nodes. HAProxy listens on port 16443 and forwards to the kube-apiservers on port 6443; in the HAProxy health check, change the https://localhost:16443/healthz address and port to the HAProxy IP and the apiserver port it listens on.
Because kube-apiserver has not been deployed yet and nothing is listening on 6443, for now the keepalived check only runs pgrep haproxy to verify the process is alive, rather than probing HAProxy backend state. Edit /etc/keepalived/keepalived.conf on demo-master-01 and on demo-master-02 accordingly; a sketch of both configurations follows.
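A minimal sketch of the two configurations, assuming the VIP 192.168.122.170 on interface eth0 (the VIP, the interface name, and the password are illustrative assumptions, not values from the original):

```
# /etc/haproxy/haproxy.cfg (identical on both masters) -- a sketch
frontend kube-apiserver
    bind *:16443                      # the port keepalived and kubeadm will target
    mode tcp
    default_backend kube-apiserver
backend kube-apiserver
    mode tcp
    balance roundrobin
    server demo-master-01 192.168.122.171:6443 check
    server demo-master-02 192.168.122.172:6443 check
```

```
# /etc/keepalived/keepalived.conf on demo-master-01 -- a sketch
# (on demo-master-02 use state BACKUP and a lower priority, e.g. 90)
vrrp_script chk_haproxy {
    script "/usr/bin/pgrep haproxy"   # process liveness only, per the note above
    interval 2
    weight -30
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0                    # assumed interface name
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass k8s-ha              # illustrative password
    }
    virtual_ipaddress {
        192.168.122.170/24            # assumed VIP
    }
    track_script {
        chk_haproxy
    }
}
```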
- At this point the HAProxy + keepalived deployment and configuration is complete. You can leave the services stopped for now and start them after the K8s control plane has been deployed.
- When you do start haproxy and keepalived, verify that the VIP is up (see the check below); the VIP should be visible on the primary node.
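A quick way to start the services and confirm the VIP, reusing the assumed 192.168.122.170 address and eth0 interface from the sketch above (and assuming systemd units are in place):

```
systemctl enable --now haproxy keepalived
ip addr show eth0 | grep 192.168.122.170   # the VIP should sit on the MASTER node
ss -lntp | grep 16443                      # HAProxy listening for the apiserver
```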
Install the container runtime and tools
Next, install the container runtime stack: the CNI plugins, runc, containerd, crictl, and nerdctl.
Install the CNI plugins
Download the CNI plugins and install them under /opt/cni/bin:
- CNI_PLUGINS_VERSION="v1.3.0"
- ARCH="amd64"
- DEST="/opt/cni/bin"
- sudo mkdir -p "$DEST"
- curl -L "https://github.com/containernetworking/plugins/releases/download/${CNI_PLUGINS_VERSION}/cni-plugins-linux-${ARCH}-${CNI_PLUGINS_VERSION}.tgz" | sudo tar -C "$DEST" -xz
Or extract a pre-downloaded tarball:
- mkdir -p /opt/cni/bin
- tar Cxzvf /opt/cni/bin cni-plugins-linux-amd64-v1.3.0.tgz
Install runc
Download the runc binary and install it:
- wget https://github.com/opencontainers/runc/releases/download/v1.4.0/runc.amd64
- install -m 755 runc.amd64 /usr/local/sbin/runc
Install containerd
Download the containerd release and extract it under /usr/local:
- wget https://github.com/containerd/containerd/releases/download/v2.2.1/containerd-2.2.1-linux-amd64.tar.gz
- tar Cxzvf /usr/local containerd-2.2.1-linux-amd64.tar.gz
Install the upstream containerd.service systemd unit:
- # Copyright The containerd Authors.
- #
- # Licensed under the Apache License, Version 2.0 (the "License");
- # you may not use this file except in compliance with the License.
- # You may obtain a copy of the License at
- #
- # http://www.apache.org/licenses/LICENSE-2.0
- #
- # Unless required by applicable law or agreed to in writing, software
- # distributed under the License is distributed on an "AS IS" BASIS,
- # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- # See the License for the specific language governing permissions and
- # limitations under the License.
- [Unit]
- Description=containerd container runtime
- Documentation=https://containerd.io
- After=network.target dbus.service
- [Service]
- ExecStartPre=-/sbin/modprobe overlay
- ExecStart=/usr/local/bin/containerd
- Type=notify
- Delegate=yes
- KillMode=process
- Restart=always
- RestartSec=5
- # Having non-zero Limit*s causes performance problems due to accounting overhead
- # in the kernel. We recommend using cgroups to do container-local accounting.
- LimitNPROC=infinity
- LimitCORE=infinity
- # Comment TasksMax if your systemd version does not supports it.
- # Only systemd 226 and above support this version.
- TasksMax=infinity
- OOMScoreAdjust=-999
- [Install]
- WantedBy=multi-user.target
Start containerd and enable it at boot:
- systemctl daemon-reload
- systemctl enable --now containerd
- systemctl status containerd.service
Install crictl
- DOWNLOAD_DIR="/usr/local/bin"
- CRICTL_VERSION="v1.31.0"
- ARCH="amd64"
- curl -L "https://github.com/kubernetes-sigs/cri-tools/releases/download/${CRICTL_VERSION}/crictl-${CRICTL_VERSION}-linux-${ARCH}.tar.gz" | sudo tar -C $DOWNLOAD_DIR -xz
Or extract a pre-downloaded tarball:
- tar Cxzvf /usr/local/bin crictl-v1.31.0-linux-amd64.tar.gz
Point crictl at the containerd socket:
- cat > /etc/crictl.yaml <<EOF
- runtime-endpoint: unix:///run/containerd/containerd.sock
- image-endpoint: unix:///run/containerd/containerd.sock
- timeout: 10
- debug: true
- pull-image-on-create: false
- EOF
Install nerdctl
- # Download
- wget https://github.com/containerd/nerdctl/releases/download/v2.2.1/nerdctl-2.2.1-linux-amd64.tar.gz
- # Extract
- tar xf nerdctl-2.2.1-linux-amd64.tar.gz -C /usr/local/bin/
- # Verify it runs
- nerdctl images
Generate the default containerd configuration and adjust it:
- containerd config default > /etc/containerd/config.toml
- vim /etc/containerd/config.toml
- # Configure the systemd cgroup driver
- [plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.runc]
- ...
- [plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.runc.options]
- SystemdCgroup = true
- # Point the sandbox (pause) image at a reachable mirror
- [plugins.'io.containerd.cri.v1.images'.pinned_images]
- sandbox = 'registry.aliyuncs.com/google_containers/pause:3.10.1'
Restart containerd (systemctl restart containerd) so the configuration changes take effect.
Install kubeadm and kubelet
Download the binaries and service files following the official install guide:
- DOWNLOAD_DIR="/usr/local/bin"
- RELEASE="v1.35.0"
- ARCH="amd64"
- cd $DOWNLOAD_DIR
- sudo curl -L --remote-name-all https://dl.k8s.io/release/${RELEASE}/bin/linux/${ARCH}/{kubeadm,kubelet}
- sudo chmod +x {kubeadm,kubelet}
- RELEASE_VERSION="v0.16.2"
- curl -sSL "https://raw.githubusercontent.com/kubernetes/release/${RELEASE_VERSION}/cmd/krel/templates/latest/kubelet/kubelet.service" | sed "s:/usr/bin:${DOWNLOAD_DIR}:g" | sudo tee /usr/lib/systemd/system/kubelet.service
- sudo mkdir -p /usr/lib/systemd/system/kubelet.service.d
- curl -sSL "https://raw.githubusercontent.com/kubernetes/release/${RELEASE_VERSION}/cmd/krel/templates/latest/kubeadm/10-kubeadm.conf" | sed "s:/usr/bin:${DOWNLOAD_DIR}:g" | sudo tee /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
If the binaries were downloaded manually instead:
- chmod +x kubeadm kubelet
- mv kubeadm kubelet /usr/local/bin/
The resulting kubelet.service unit:
- [Unit]
- Description=kubelet: The Kubernetes Node Agent
- Documentation=https://kubernetes.io/docs/
- Wants=network-online.target
- After=network-online.target
- [Service]
- ExecStart=/usr/local/bin/kubelet
- Restart=always
- StartLimitInterval=0
- RestartSec=10
- [Install]
- WantedBy=multi-user.target
And the kubeadm drop-in for kubelet, 10-kubeadm.conf:
- # Note: This dropin only works with kubeadm and kubelet v1.11+
- [Service]
- Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
- Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
- # This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
- EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
- # This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
- # the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
- EnvironmentFile=-/etc/sysconfig/kubelet
- ExecStart=
- ExecStart=/usr/local/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
Enable kubelet now; it will crash-loop until kubeadm supplies its configuration:
- systemctl enable --now kubelet
- systemctl status kubelet
Install kubectl
Download the kubectl binary from dl.k8s.io the same way, then install it:
- install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
- kubectl version --client
Configure shell completion:
- yum install bash-completion
- source /usr/share/bash-completion/bash_completion
- # kubectl completion
- source <(kubectl completion bash)
- kubectl completion bash >/etc/bash_completion.d/kubectl
- # crictl completion
- source <(crictl completion bash)
- crictl completion bash >/etc/bash_completion.d/crictl
- # nerdctl completion
- source <(nerdctl completion bash)
- nerdctl completion bash >/etc/bash_completion.d/nerdctl
Set up the etcd cluster
etcd runs on all three nodes. Following the official kubeadm HA-etcd guide, generate a kubeadm config file for each member on demo-master-01:
- # Update HOST0, HOST1 and HOST2 with your host IPs
- export HOST0=192.168.122.171
- export HOST1=192.168.122.172
- export HOST2=192.168.122.173
- # Update NAME0, NAME1 and NAME2 with your hostnames
- export NAME0="demo-master-01"
- export NAME1="demo-master-02"
- export NAME2="demo-worker-01"
- # Create temp dirs to store the files that will be shipped to the other hosts
- mkdir -p /tmp/${HOST0}/ /tmp/${HOST1}/ /tmp/${HOST2}/
- HOSTS=(${HOST0} ${HOST1} ${HOST2})
- NAMES=(${NAME0} ${NAME1} ${NAME2})
- for i in "${!HOSTS[@]}"; do
- HOST=${HOSTS[$i]}
- NAME=${NAMES[$i]}
- cat << EOF > /tmp/${HOST}/kubeadmcfg.yaml
- ---
- apiVersion: "kubeadm.k8s.io/v1beta4"
- kind: InitConfiguration
- nodeRegistration:
-   name: ${NAME}
- localAPIEndpoint:
-   advertiseAddress: ${HOST}
- ---
- apiVersion: "kubeadm.k8s.io/v1beta4"
- kind: ClusterConfiguration
- etcd:
-   local:
-     serverCertSANs:
-     - "${HOST}"
-     peerCertSANs:
-     - "${HOST}"
-     extraArgs:
-     - name: initial-cluster
-       value: ${NAMES[0]}=https://${HOSTS[0]}:2380,${NAMES[1]}=https://${HOSTS[1]}:2380,${NAMES[2]}=https://${HOSTS[2]}:2380
-     - name: initial-cluster-state
-       value: new
-     - name: name
-       value: ${NAME}
-     - name: listen-peer-urls
-       value: https://${HOST}:2380
-     - name: listen-client-urls
-       value: https://${HOST}:2379
-     - name: advertise-client-urls
-       value: https://${HOST}:2379
-     - name: initial-advertise-peer-urls
-       value: https://${HOST}:2380
- EOF
- done
Generate the etcd CA on demo-master-01:
- kubeadm init phase certs etcd-ca
- # This generates the following two files
- ls /etc/kubernetes/pki/etcd
- ca.crt ca.key
Generate the certificates for each etcd member, working from HOST2 back to HOST0 so each host gets its own serving certs:
- # Update HOST0, HOST1 and HOST2 with your host IPs
- export HOST0=192.168.122.171
- export HOST1=192.168.122.172
- export HOST2=192.168.122.173
- kubeadm init phase certs etcd-server --config=/tmp/${HOST2}/kubeadmcfg.yaml
- kubeadm init phase certs etcd-peer --config=/tmp/${HOST2}/kubeadmcfg.yaml
- kubeadm init phase certs etcd-healthcheck-client --config=/tmp/${HOST2}/kubeadmcfg.yaml
- kubeadm init phase certs apiserver-etcd-client --config=/tmp/${HOST2}/kubeadmcfg.yaml
- cp -R /etc/kubernetes/pki /tmp/${HOST2}/
- # Clean up certs that must not be reused
- find /etc/kubernetes/pki -not -name ca.crt -not -name ca.key -type f -delete
- kubeadm init phase certs etcd-server --config=/tmp/${HOST1}/kubeadmcfg.yaml
- kubeadm init phase certs etcd-peer --config=/tmp/${HOST1}/kubeadmcfg.yaml
- kubeadm init phase certs etcd-healthcheck-client --config=/tmp/${HOST1}/kubeadmcfg.yaml
- kubeadm init phase certs apiserver-etcd-client --config=/tmp/${HOST1}/kubeadmcfg.yaml
- cp -R /etc/kubernetes/pki /tmp/${HOST1}/
- find /etc/kubernetes/pki -not -name ca.crt -not -name ca.key -type f -delete
- kubeadm init phase certs etcd-server --config=/tmp/${HOST0}/kubeadmcfg.yaml
- kubeadm init phase certs etcd-peer --config=/tmp/${HOST0}/kubeadmcfg.yaml
- kubeadm init phase certs etcd-healthcheck-client --config=/tmp/${HOST0}/kubeadmcfg.yaml
- kubeadm init phase certs apiserver-etcd-client --config=/tmp/${HOST0}/kubeadmcfg.yaml
- # No need to move these certs; they are for HOST0
- # Clean up certs that should not be copied off this host
- find /tmp/${HOST2} -name ca.key -type f -delete
- find /tmp/${HOST1} -name ca.key -type f -delete
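The remaining steps from the official guide (shipping the certs and creating the etcd static pods) are not shown above; a sketch, assuming root SSH between the nodes:

```
# On demo-master-01: ship each host's certs and kubeadm config
scp -r /tmp/${HOST2}/* root@${HOST2}:
scp -r /tmp/${HOST1}/* root@${HOST1}:
# On demo-master-02 and demo-worker-01: move the shipped pki dir into place
#   mv /root/pki /etc/kubernetes/
# On every host (kubelet must already be running): create the etcd static pod,
# pointing --config at that host's own kubeadmcfg.yaml
kubeadm init phase etcd local --config=/tmp/${HOST0}/kubeadmcfg.yaml
```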
Deploy the cluster
Pull images
Pre-pull the control-plane images on every node so that kubeadm init runs faster (a sketch follows).
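A sketch, reusing the Aliyun mirror configured for the sandbox image earlier (the flags are standard kubeadm; the mirror choice follows this article):

```
kubeadm config images pull \
  --kubernetes-version v1.35.0 \
  --image-repository registry.aliyuncs.com/google_containers
```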
Initialize the kubeadm configuration
Some parameters and values need to be customized; only the fields that need attention and changes are kept (a sketch follows).
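A minimal sketch of the init configuration. The VIP 192.168.122.170 carries over from the assumed keepalived sketch, and the Pod CIDR is illustrative; the etcd endpoints and cert paths follow the external etcd cluster built above:

```
# kubeadm-config.yaml -- a sketch, not the original file
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
kubernetesVersion: v1.35.0
controlPlaneEndpoint: "192.168.122.170:16443"   # keepalived VIP + HAProxy port (assumed)
imageRepository: registry.aliyuncs.com/google_containers
networking:
  podSubnet: "10.244.0.0/16"                    # illustrative; must match the Cilium settings
etcd:
  external:                                     # the etcd cluster built above
    endpoints:
      - https://192.168.122.171:2379
      - https://192.168.122.172:2379
      - https://192.168.122.173:2379
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
```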
Create the cluster with kubeadm
kubeadm init --skip-phases=addon/kube-proxy skips installing the kube-proxy component during the init phase, since Cilium will take over that role.
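A sketch of the init command on demo-master-01, using the config file from the sketch above:

```
kubeadm init --config kubeadm-config.yaml \
  --upload-certs \
  --skip-phases=addon/kube-proxy
```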
- Following the printed join command, add the new control-plane node by running it on demo-master-02 (sketch below).
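The exact token, hash, and certificate key are printed by kubeadm init; placeholders here:

```
kubeadm join 192.168.122.170:16443 --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash> \
  --control-plane --certificate-key <key>
```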
- Likewise, add the new worker node by running the printed join command on demo-worker-01 (sketch below). Until a CNI plugin is installed, the nodes will report NotReady.
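Again with placeholder credentials:

```
kubeadm join 192.168.122.170:16443 --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash>
```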
Deploy the CNI plugin cilium
Cilium is one of the most popular CNI plugins today. Built on eBPF, it is fast and efficient, unifies load balancing, network policy, and observability, and can fully replace the kube-proxy component.
Deploy the helm tool
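Helm is only needed on the node you operate from; a sketch using the upstream binary release (the helm version is illustrative):

```
wget https://get.helm.sh/helm-v3.16.4-linux-amd64.tar.gz
tar xf helm-v3.16.4-linux-amd64.tar.gz
install -m 755 linux-amd64/helm /usr/local/bin/helm
helm version
```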
Deploy the cilium chart
Fetch the chart, modify a few key variable values in cilium/values.yaml, and install; the important fields are shown in the sketch below.
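A sketch of pulling and installing the chart at the version from the table above. The service host/port reuse the assumed VIP, and hubble.relay.enabled/hubble.ui.enabled are the values called out in the hubble section:

```
helm repo add cilium https://helm.cilium.io/
helm pull cilium/cilium --version 1.18.6 --untar
vim cilium/values.yaml
# key fields to change:
#   kubeProxyReplacement: true          # cilium takes over kube-proxy's job
#   k8sServiceHost: 192.168.122.170     # the keepalived VIP (assumed)
#   k8sServicePort: 16443               # the HAProxy port
#   hubble.relay.enabled: true
#   hubble.ui.enabled: true
helm install cilium ./cilium -n kube-system
```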
Install the cilium cli tool
Use it to check the cilium cluster status once the chart is installed (see below).
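A sketch following the upstream cilium-cli install snippet:

```
CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
curl -L --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-amd64.tar.gz
tar xzvf cilium-linux-amd64.tar.gz -C /usr/local/bin
# check the cilium cluster status
cilium status --wait
```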
Enable and manage hubble
Hubble can be enabled through the chart values above (hubble.relay.enabled=true, hubble.ui.enabled=true) or with the cilium cli. Once the UI is exposed, browse http://192.168.122.171:32000/ to inspect traffic flows in detail!
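A sketch; the NodePort patch is an assumed way of exposing the UI on port 32000 to match the URL above:

```
cilium hubble enable --ui
# expose hubble-ui on NodePort 32000 (assumed exposure method)
kubectl -n kube-system patch svc hubble-ui \
  -p '{"spec":{"type":"NodePort","ports":[{"port":80,"nodePort":32000}]}}'
```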
Check the k8s cluster status
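A few standard checks:

```
kubectl get nodes -o wide        # all three nodes should be Ready
kubectl get pods -A -o wide      # control-plane and cilium pods Running
kubectl get --raw='/readyz?verbose'
```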
At this point the highly available k8s cluster is complete. In real production environments, etcd, keepalived, and HAProxy should run on dedicated servers wherever possible, and larger clusters should be tuned further according to actual conditions.
Closing remarks
This article built a highly available cluster with the official kubeadm tool, installing essentially every service from binary releases. The approach is very portable: on arm machines you only need the arm builds of the same binaries, and the steps are almost identical. Most k8s cluster deployment tools on the market install things this same way under the hood.
The goal was to walk through the full process of deploying highly available k8s on the domestic Kylin Linux V11, and along the way to understand the underlying logic of automated k8s deployment tools. For real production use you will of course prefer automation; Kylin V11 also ships its own official deployment tooling, which is worth trying.
A follow-up article will capture and analyze Cilium's traffic paths to dig deeper into how it works; stay tuned if you are interested.
Finally, all the installation packages and images used in this deployment can be obtained by clicking here.
References:
About cgroup v2: usage and introduction
containerd official deployment guide
kubeadm, kubelet and kubectl official install guide
kubeadm cluster creation official guide
Container runtime cgroup driver official guide
cilium official guide