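Troubleshooting a Kubernetes pod stuck in the ContainerCreating state
Adapted from an earlier write-up, with minor changes and notes based on my own environment.
While creating the Dashboard, the pod status stayed at ContainerCreating: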
[root@MyCentos7 k8s]# kubectl get pod --namespace=kube-system
NAME READY STATUS RESTARTS AGE
kubernetes-dashboard-2094756401-kzhnx 0/1 ContainerCreating 0 10m
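When a namespace holds many pods, it can help to filter the listing down to the stuck ones. The following is a minimal sketch, not part of the original procedure; the function name stuck_pods is hypothetical, and it assumes the default column order (NAME READY STATUS RESTARTS AGE) shown above.

```shell
# Print the names of pods whose STATUS column is ContainerCreating.
# Reads `kubectl get pod` output on stdin and skips the header line.
# Assumption: STATUS is the third whitespace-separated column.
stuck_pods() {
    awk 'NR > 1 && $3 == "ContainerCreating" { print $1 }'
}
```

Typical use would be: kubectl get pod --namespace=kube-system | stuck_pods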
Use kubectl describe to see the details (or check the log in /var/log/messages):
[root@MyCentos7 k8s]# kubectl describe pod kubernetes-dashboard-2094756401-kzhnx --namespace=kube-system
Name: kubernetes-dashboard-2094756401-kzhnx
Namespace: kube-system
Node: mycentos7-1/192.168.126.131
Start Time: Tue, 05 Jun 2018 19:28:25 +0800
Labels: app=kubernetes-dashboard
pod-template-hash=2094756401
Status: Pending
IP:
Controllers: ReplicaSet/kubernetes-dashboard-2094756401
Containers:
kubernetes-dashboard:
Container ID:
Image: daocloud.io/megvii/kubernetes-dashboard-amd64:v1.8.0
Image ID:
Port: 9090/TCP
Args:
--apiserver-host=192.168.126.130:8080
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Liveness: http-get :9090/ delay=30s timeout=30s period=10s #success=1 #failure=3
Volume Mounts: <none>
Environment Variables: <none>
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
No volumes.
QoS Class: BestEffort
Tolerations: <none>
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
11m 11m 1 {default-scheduler } Normal Scheduled Successfully assigned kubernetes-dashboard-2094756401-kzhnx to mycentos7-1
11m 49s 7 {kubelet mycentos7-1} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "image pull failede:latest, this may be because there are no credentials on this
11m 11s 47 {kubelet mycentos7-1} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "POD" with ImagePullBackOff: "Back-off pulling image \"dh
On the worker node, the kubelet tries to pull the image dhat/rhel7/pod-infrastructure:latest. Pulling it manually on that node gave the following error:
[root@MyCentos7 k8s]# docker pull dhat/rhel7/pod-infrastructure:latest
Trying to pull repository dhat/rhel7/pod-infrastructure ...
open /etc/docker/certs.d/: no such file or directory
The path in the error leads to a file that is a symlink into /etc/rhsm, but /etc/rhsm does not exist:
[root@MyCentos7 ca]# cd /etc/docker/certs.d/dhat/
[root@MyCentos7 dhat]# ll
total 0
lrwxrwxrwx. 1 root root 27 May 11 14: -> /etc/rhsm/ca/redhat-uep.pem
[root@MyCentos7 ca]# cd /etc/rhsm
-bash: cd: /etc/rhsm: No such file or directory
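The same check can be done in one pass instead of following each link by hand. This is a small sketch under the assumption that find and test are available on the node; the function name dangling_links is hypothetical.

```shell
# List symlinks under a directory whose targets do not exist.
# `test -e` follows the link, so it fails for a dangling one,
# and the negation makes find print exactly those links.
dangling_links() {
    find "$1" -type l ! -exec test -e {} \; -print
}
```

For the case above, dangling_links /etc/docker/certs.d would be expected to print the link that points at /etc/rhsm/ca/redhat-uep.pem.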
Install rhsm (on the node):
yum install *rhsm*
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
* base: mirror.lzu.edu
* extras: mirror.lzu.edu
* updates: ftp.sjtu.edu
base | 3.6 kB 00:00:00
extras | 3.4 kB 00:00:00
updates | 3.4 kB 00:00:00
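Package python-rhsm-1.19.10-1.el7_4.x86_64 is obsoleted by subscription-manager-rhsm-1.20.s.x86_64 which is already installed
Package subscription-manager-rhsm-1.20.s.x86_64 already installed and latest version
Package python-rhsm-certificates-1.19.10-1.el7_4.x86_64 is obsoleted by subscription-manager-rhsm-certificates-1.20.s.x86_64 which is already installed
Package subscription-manager-rhsm-certificates-1.20.s.x86_64 already installed and latest version
However, /etc/rhsm/ca/ still contained no certificate file, and repeatedly uninstalling and reinstalling did not help. It turns out that what most guides mean by yum install *rhsm* is installing python-rhsm-1.19.10-1.el7_4.x86_64 and python-rhsm-certificates-1.19.10-1.el7_4.x86_64, but the actual installation prints the following:
Package python-rhsm-1.19.10-1.el7_4.x86_64 is obsoleted by subscription-manager-rhsm-1.20.s.x86_64 which is already installed
Package subscription-manager-rhsm-1.20.s.x86_64 already installed and latest version
Package python-rhsm-certificates-1.19.10-1.el7_4.x86_64 is obsoleted by subscription-manager-rhsm-certificates-1.20.s.x86_64 which is already installed
Package subscription-manager-rhsm-certificates-1.20.s.x86_64 already installed and latest version
This is the culprit: the rpm packages we wanted have been obsoleted. The replacement packages only create the directory when installed and do not ship the certificate file redhat-uep.pem. So download the two original packages manually: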
wget ftp://ftp.icm.edu.pl/vol/rzm6/linux-scientificlinux/7.4/x86_64/os/Packages/python-rhsm-certificates-1.19.9-1.el7.x86_64.rpm
wget ftp://ftp.icm.edu.pl/vol/rzm6/linux-scientificlinux/7.4/x86_64/os/Packages/python-rhsm-1.19.9-1.el7.x86_64.rpm
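Note: this step sometimes fails with a message that the two rpm files cannot be found. In that case, log in to the FTP server manually; the file listing takes a while to load, after which the two rpm files can be downloaded (probably a network issue, as the server is sometimes unstable).
Make sure the versions match, then remove the wrongly installed packages: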
yum remove *rhsm*
Then run the install command:
rpm -ivh *.rpm
warning: python-rhsm-1.19.9-1.el7.x86_64.rpm: Header V4 DSA/SHA1 Signature, key ID 192a7d7d: NOKEY
Preparing... >>>>>>### [100%]
Updating / installing...
1:python-rhsm-certificates-1.19.9-1>>>>>>### [ 50%]
2:python-rhsm-1.19.9-1.el7 >>>>>>### [100%]
I hit another error at this step:
[root@neal dhat]# rpm -ivh *.rpm
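warning: python-rhsm-1.19.9-1.el7.x86_64.rpm: Header V4 DSA/SHA1 Signature, key ID 192a7d7d: NOKEY
error: Failed dependencies:
python-rhsm <= 1.20.3-1 is obsoleted by (installed) subscription-manager-rhsm-1.20.s.x86_64
python-rhsm-certificates <= 1.20.3-1 is obsoleted by (installed) subscription-manager-rhsm-certificates-1.20.s.x86_64
If this happens, skip to the section below the divider, remove the already-installed packages as described there, and then rerun the install command above.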
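To see at a glance which installed packages are blocking the install, the dependency error can be filtered for the package names. This is only a sketch; the function name blocking_packages is hypothetical, and it assumes each blocking line has the shape shown above, with the installed package name following the parenthesised "installed" marker.

```shell
# Print the installed package named on each dependency-error line
# read from stdin. The match keys on the closing parenthesis of the
# "(installed)" marker rather than on its wording, so lines without
# a parenthesis (the warning and the error header) are skipped.
blocking_packages() {
    sed -n 's/.*) \([^ ]*\).*/\1/p'
}
```

The names it prints are the ones to pass to yum remove before retrying rpm -ivh.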
Next, verify by pulling the image manually:
docker pull dhat/rhel7/pod-infrastructure:latest
Trying to pull repository dhat/rhel7/pod-infrastructure ...
latest: Pulling from dhat/rhel7/pod-infrastructure
26e5ed6899db: Pull complete
66dbe984a319: Pull complete
9138e7863e08: Pull complete
Digest: sha256:92d43c37297da3ab187fc2b9e9ebfb243c1110d446c783ae1b989088495db931
Status: Downloaded newer image for dhat/rhel7/pod-infrastructure:latest
Problem solved.
--------------------------------------------------------------------------------------------------------------------------------
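While working through an introductory example in 《kubernetes权威指南》 (Kubernetes: The Definitive Guide), the pod stayed in the ContainerCreating state, and kubectl describe pod mysql showed the following errors: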
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
1h 24m 17 {kubelet 127.0.0.1} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "image pull failed for dhat/rhel7/pod-infrastructure:latest, this may be because there are no credentials on this request. details: (open /etc/docker/certs.d/: no such file or directory)"
1h 19m 291 {kubelet 127.0.0.1} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "POD" with ImagePullBackOff: "Back-off pulling image \"dhat/rhel7/pod-infrastructure:latest\""
15m 15m 1 {kubelet 127.0.0.1} Warning MissingClusterDNS kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to DNSDefault policy.
15m 15m 1 {kubelet 127.0.0.1} ainers{mysql} Normal Pulling pulling image "mysql"
7m 7m 1 {kubelet 127.0.0.1} Warning MissingClusterDNS kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to DNSDefault policy.
7m 7m 1 {kubelet 127.0.0.1} ainers{mysql} Normal Pulling pulling image "mysql"
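In a long event list, a quick total of the image-pull failures separates them from unrelated warnings such as MissingClusterDNS. A rough sketch follows; the function name pull_failures is hypothetical, and it assumes the older event layout shown above, in which Count is the third column.

```shell
# Sum the Count column (third field) of every event line that
# mentions ErrImagePull or ImagePullBackOff. Reads the output of
# `kubectl describe pod` on stdin and prints 0 if nothing matches.
pull_failures() {
    awk '/ErrImagePull|ImagePullBackOff/ { total += $3 } END { print total + 0 }'
}
```

Typical use would be: kubectl describe pod mysql | pull_failures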
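The problem is fairly clear: the certificate file under /etc/docker/certs.d/ is missing. ls -l shows that it is a symlink to /etc/rhsm/ca/redhat-uep.pem, and that target does not exist. Use yum search *rhsm* to find the relevant packages.
Install the python-rhsm-certificates package: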
# yum install python-rhsm-certificates -y
Here another problem appears:
python-rhsm-certificates <= 1.20.3-1 is obsoleted by (installed) subscription-manager-rhsm-certificates-1.20.s.x86_64
The workaround is to uninstall the subscription-manager-rhsm-certificates package with yum remove subscription-manager-rhsm-certificates -y, then download the python-rhsm-certificates package:
# wget /centos/7/os/x86_64/Packages/python-rhsm-certificates-1.19.10-1.el7_4.x86_64.rpm
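Then install the downloaded rpm manually:
# rpm -ivh python-rhsm-certificates-1.19.10-1.el7_4.x86_64.rpm
After this, /etc/rhsm/ca/redhat-uep.pem exists.
Now pull the image with docker pull dhat/rhel7/pod-infrastructure:latest. The download can be very slow. To speed it up, register an account at dashboard.daocloud.io, open the accelerator page, copy the command shown there and run it, then restart docker. If the docker service fails to start after the restart, check it with systemctl status docker: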
# systemctl status docker
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Mon 2018-05-28 22:13:37 CST; 13s ago
Docs: docs.docker
Process: 79849 ExecStart=/usr/bin/dockerd-current --add-runtime docker-runc=/usr/libexec/docker/docker-runc-current --default-runtime=docker-runc --updriver=systemd --userland-proxy-path=/usr/libexec/docker/docker-proxy-current --init-path=/usr/libexec/docker/docker-init-current --seccomp-profile=/etc/docker/seccomp.json $OPTIONS $DOCKER_STORAGE_OPTIONS $DOCKER_NETWORK_OPTIONS $ADD_REGISTRY $BLOCK_REGISTRY $INSECURE_REGISTRY $REGISTRIES (code=exited, status=1/FAILURE)
Main PID: 79849 (code=exited, status=1/FAILURE)
May 28 22:13:ample systemd[1]: Starting Docker Application
May 28 22:13:ample dockerd-current[79849]: unable to configure the Docker daemon with file /etc/docker/daemon.json: invalid character '}' string
May 28 22:13:ample systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
May 28 22:13:ample systemd[1]: Failed to start Docker Application Container Engine.
May 28 22:13:ample systemd[1]: Unit docker.service entered failed state.
May 28 22:13:ample systemd[1]: docker.service failed.
Hint: Some lines were ellipsized, use -l to show in full
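The fix given here is to delete /etc/docker/seccomp.json and restart docker again. Note, though, that the log above names /etc/docker/daemon.json as the file that failed to parse (invalid character '}'), so the JSON syntax of that file is worth checking as well.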
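The log above reports a parse error in /etc/docker/daemon.json (invalid character '}'). With Go's JSON parser, one common cause of that message is a trailing comma before a closing brace or bracket. The following is a rough heuristic for spotting it, not a JSON validator; the function name has_trailing_comma is hypothetical, and whether this was the actual cause on the author's machine is an assumption.

```shell
# Succeed if the file appears to contain a comma directly before a
# closing } or ] once all whitespace is removed.
# Heuristic only: a ",}" sequence inside a string value would also match.
has_trailing_comma() {
    tr -d '[:space:]' < "$1" | grep -Eq ',[]}]'
}
```

For example, has_trailing_comma /etc/docker/daemon.json && echo "trailing comma found" could be run before restarting docker.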
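Then delete the previously created rc, svc, and pod, and create them again. After a short while the pod starts successfully.
Likely cause: judging from the error message, starting a pod requires the dhat/rhel7/pod-infrastructure:latest image, which has to be downloaded from the Red Hat registry, but the certificate was missing. Once the certificate is installed, the pull works.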