k3s
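  I saw a colleague's write-up on k3s and assumed it was a typo for k8s. I looked it up, and k3s really does exist.
My own knowledge of k8s covers what the main components do; mostly I use kubectl to track down problems. In a technical job you have to learn what the work demands and also broaden your knowledge.
  k3s is a simplified, lightweight k8s from Rancher®. This post records how I installed k3s and some of the pitfalls I ran into.
  As the name suggests, k3s contains less than k8s. Details are on the k3s official website, which is also the reference for trying it locally.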
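Installation steps
Preparation
  First download the main executable k3s, the offline image package k3s-airgap-images-amd64.tar, and the install script
  I used v1.18.6+k3s1, which was released on July 16, 2020.
  Make the executable and the script executable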
wget get.k3s.io -O install-k3s.sh
chmod +x install-k3s.sh
  /usr/local/bin/k3s must exist; a symlink is one option
sudo ln -s /home/dev/program/k3s /usr/local/bin/k3s
  Copy the tar file to /var/lib/rancher/k3s/agent/images
sudo mkdir -p /var/lib/rancher/k3s/agent/images
sudo cp k3s-airgap-images-amd64.tar /var/lib/rancher/k3s/agent/images
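  Before running the install script, it can help to confirm that the three offline prerequisites are in place. The following is only a minimal sketch: the function name check_k3s_prereqs and its arguments are illustrative and not part of k3s, and the defaults are simply the paths used in this post.

```shell
# check_k3s_prereqs is a hypothetical helper, not part of k3s.
# Arguments (defaults follow the paths used in this post):
#   $1 k3s binary, $2 image directory, $3 install script
check_k3s_prereqs() {
    bin="${1:-/usr/local/bin/k3s}"
    images="${2:-/var/lib/rancher/k3s/agent/images}"
    script="${3:-./install-k3s.sh}"
    missing=0

    # The binary (or the symlink pointing at it) and the script must be executable.
    for f in "$bin" "$script"; do
        if [ ! -x "$f" ]; then
            echo "not executable or missing: $f" >&2
            missing=1
        fi
    done

    # The air-gap image tarball must sit in the image directory.
    if [ ! -f "$images/k3s-airgap-images-amd64.tar" ]; then
        echo "missing: $images/k3s-airgap-images-amd64.tar" >&2
        missing=1
    fi

    return "$missing"
}
```

  Calling check_k3s_prereqs with no arguments checks the default paths and returns non-zero if anything is missing.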
Customizing some variables
  First set the following variables:
export INSTALL_K3S_SKIP_DOWNLOAD=true
export INSTALL_K3S_EXEC="--docker --write-kubeconfig ~/.kube/config --write-kubeconfig-mode 666"
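  One by one:
1. INSTALL_K3S_SKIP_DOWNLOAD=true: do not download the k3s executable
2. INSTALL_K3S_EXEC="(omitted)": extra arguments used when the k3s service is started
3. --docker: use docker instead of the default containerd
4. --write-kubeconfig-mode 666: make the config file readable and writable by non-owners, so that kubectl works without root or sudo
5. --write-kubeconfig ~/.kube/config: write the config file to the location k8s tools use by default instead of the k3s default, /etc/rancher/k3s/k3s.yaml. The k3s default forces extra settings on istio and helm, or stops them from running. Other options are available as well
Running the install script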
$ ./install-k3s.sh
[INFO]  Skipping k3s download and verify
[INFO]  Skipping /usr/local/bin/kubectl symlink to k3s, already exists
[INFO]  Creating /usr/local/bin/crictl symlink to k3s
[INFO]  Skipping /usr/local/bin/ctr symlink to k3s, command exists in PATH at /usr/bin/ctr
[INFO]  Creating killall script /usr/local/bin/k3s-killall.sh
[INFO]  Creating uninstall script /usr/local/bin/k3s-uninstall.sh
[INFO]  env: Creating environment file /etc/systemd/system/v
[INFO]  systemd: Creating service file /etc/systemd/system/k3s.service
[INFO]  systemd: Enabling k3s unit
Created symlink /etc/systemd/system/multi-user.target.wants/k3s.service → /etc/systemd/system/k3s.service.
[INFO]  systemd: Starting k3s
  Run the k3s command to see the result
$ k3s
NAME:
k3s - Kubernetes, but small and simple
USAGE:
k3s [global options] command [command options] []
VERSION:
v0.9.1 (755bd1c6)
COMMANDS:
server  Run management server
agent    Run node agent
kubectl  Run kubectl
crictl  Run crictl
ctr      Run ctr
help, h  Shows a list of commands or help for one command
GLOBAL OPTIONS:
--debug        Turn on debug logs
--help, -h    show help
--version, -v  print the version
  There are also k3s kubectl and kubectl
$ k3s kubectl get all --all-namespaces
NAMESPACE    NAME                            READY  STATUS    RESTARTS  AGE
kube-system  pod/coredns-66f496764-mkwjv      1/1    Running  0          5m9s
kube-system  pod/helm-install-traefik-t4xlj  1/1    Running  0          5m8s
NAMESPACE    NAME                TYPE        CLUSTER-IP  EXTERNAL-IP  PORT(S)                  AGE
kube-system  service/kube-dns    ClusterIP  10.43.0.10  <none>        53/UDP,53/TCP,9153/TCP  5m27s
default      service/kubernetes  ClusterIP  10.43.0.1    <none>        443/TCP                  5m25s
NAMESPACE    NAME                      READY  UP-TO-DATE  AVAILABLE  AGE
kube-system  deployment.apps/coredns  1/1    1            1          5m27s
NAMESPACE    NAME                                DESIRED  CURRENT  READY  AGE
kube-system  replicaset.apps/coredns-66f496764  1        1        1      5m9s
NAMESPACE    NAME                            COMPLETIONS  DURATION  AGE
kube-system  job.batch/helm-install-traefik  0/1          5m8s      5m25s
Accessing the kubernetes service
  Since k3s does not ship the dashboard as a web UI by default, start by accessing the k8s REST API
NAMESPACE    NAME                TYPE        CLUSTER-IP  EXTERNAL-IP  PORT(S)                  AGE
default      service/kubernetes  ClusterIP  10.43.0.1    <none>        443/TCP
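  It asks for a username and password. These are stored in ~/.kube/config, which looks roughly like this:
users:
- name: default
  user:
    password: ec2fb0ab4401d7f2525d480fd08e908d
    username: admin
  The file lives at /etc/rancher/k3s/k3s.yaml by default, but the earlier steps changed that with --write-kubeconfig ~/.kube/config
  Authentication appears to be HTTP Basic (I do not yet know k8s well enough to be sure, so treat this as unconfirmed)
  You can also test with kubectl version or any kubectl run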
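  To avoid copying the credentials by hand, they can be read out of the kubeconfig. This is a rough sketch under assumptions: kubeconfig_basic_auth is a made-up helper name, it assumes the file stores username and password in the form of the excerpt above, and the https scheme in the curl example is assumed (the post only shows 127.0.0.1:6443 in its kubectl logs).

```shell
# kubeconfig_basic_auth is a hypothetical helper: it prints "username:password"
# from a kubeconfig that stores basic-auth credentials as in the excerpt above.
# It assumes a single user entry; with several users the last one wins.
kubeconfig_basic_auth() {
    awk '
        $1 == "username:" { user = $2 }
        $1 == "password:" { pass = $2 }
        END {
            if (user == "" || pass == "") exit 1
            print user ":" pass
        }
    ' "$1"
}

# Example use (not run here). -k skips TLS verification, which is only
# acceptable for a local experiment:
#   curl -k -u "$(kubeconfig_basic_auth ~/.kube/config)" https://127.0.0.1:6443/api
```

  The function exits non-zero when either field is missing, so the curl call can be guarded with a simple if.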
Assorted problems
How to uninstall
  See the output of the install script, which mentions an uninstall script:
[INFO]  Creating killall script /usr/local/bin/k3s-killall.sh
[INFO]  Creating uninstall script /usr/local/bin/k3s-uninstall.sh
Offline image package
  If k3s-airgap-images-amd64.tar has not been copied, the pods stay stuck in ContainerCreating
$ k3s kubectl get all --all-namespaces
NAMESPACE    NAME                            READY  STATUS              RESTARTS  AGE
kube-system  pod/helm-install-traefik-t4xlj  0/1    ContainerCreating  0          4m42s
kube-system  pod/coredns-66f496764-mkwjv      0/1    ContainerCreating  0          4m43s
NAMESPACE    NAME                TYPE        CLUSTER-IP  EXTERNAL-IP  PORT(S)                  AGE
kube-system  service/kube-dns    ClusterIP  10.43.0.10  <none>        53/UDP,53/TCP,9153/TCP  5m1s
default      service/kubernetes  ClusterIP  10.43.0.1    <none>        443/TCP                  4m59s
NAMESPACE    NAME                      READY  UP-TO-DATE  AVAILABLE  AGE
kube-system  deployment.apps/coredns  0/1    1            0          5m1s
NAMESPACE    NAME                                DESIRED  CURRENT  READY  AGE
kube-system  replicaset.apps/coredns-66f496764  1        1        0      4m43s
NAMESPACE    NAME                            COMPLETIONS  DURATION  AGE
kube-system  job.batch/helm-install-traefik  0/1          4m42s      4m59s
  Once the file is copied, the installation continues
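  A quick way to spot this state is to filter the pod listing for ContainerCreating. The helper below is a sketch, not a k3s feature: stuck_pods is a hypothetical name, and it assumes the column layout of kubectl get pods --all-namespaces (NAMESPACE, NAME, READY, STATUS, ...).

```shell
# stuck_pods is a hypothetical helper: it reads the output of
# "kubectl get pods --all-namespaces" on stdin and prints namespace/name
# for every pod whose STATUS column is ContainerCreating.
stuck_pods() {
    awk 'NR > 1 && $4 == "ContainerCreating" { print $1 "/" $2 }'
}

# Example (not run here):
#   k3s kubectl get pods --all-namespaces | stuck_pods
```

  An empty result means no pod is waiting on its container; pods that linger in the list are candidates for kubectl describe.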
Image pull failures
  Image pulls can fail for one reason or another. The events are visible via kubectl describe pod coredns-57d8bbb86-mndrr -n kube-system:
Events:
Type    Reason                  Age              From                      Message
----    ------                  ----              ----                      -------
Warning  FailedScheduling        <unknown>        default-scheduler        0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
Normal  Scheduled              <unknown>        default-scheduler        Successfully assigned kube-system/coredns-57d8bbb86-mndrr to dk-aspire-5943g
Warning  FailedCreatePodSandBox  3s (x4 over 89s)  kubelet, dk-aspire-5943g  Failed create pod sandbox: rpc error: code = Unknown desc = failed pulling image "io/pause:3.1": Error response from daemon: Get k8s
  The cause is that the image io/pause:3.1 cannot be pulled, so I pulled it from Aliyun and then re-tagged it
$ docker pull registry-hangzhou.aliyuncs/google_containers/pause:3.1
3.1: Pulling from google_containers/pause
cf9202429979: Pull complete
Digest: sha256:759c3f0f6493093a9043cc813092290af69029699ade0e3dbe024e968fcb7cca
Status: Downloaded newer image for registry-hangzhou.aliyuncs/google_containers/pause:3.1
registry-hangzhou.aliyuncs/google_containers/pause:3.1
$ docker images
REPOSITORY                                                  TAG                IMAGE ID            CREATED            SIZE
registry-hangzhou.aliyuncs/google_containers/pause  3.1                da86e6ba6ca1        22 months ago      742kB
$ docker tag registry-hangzhou.aliyuncs/google_containers/pause:3.1 io/pause:3.1
$ docker images
REPOSITORY                                                  TAG                IMAGE ID            CREATED            SIZE
registry-hangzhou.aliyuncs/google_containers/pause  3.1                da86e6ba6ca1        22 months ago      742kB
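  The pull-then-tag workaround can be wrapped in a small function so it is easy to repeat for other images. This is a sketch only: mirror_image and the DOCKER override are names invented here, and both image references are passed in by the caller rather than assumed.

```shell
# mirror_image is a hypothetical helper: pull an image from a reachable
# mirror registry and tag it with the name the kubelet is asking for.
# DOCKER may be set to another single command name (e.g. DOCKER=echo)
# to get a dry run that only prints the two steps.
mirror_image() {
    mirror="$1"   # image reference on the reachable mirror
    wanted="$2"   # image reference from the FailedCreatePodSandBox event
    "${DOCKER:-docker}" pull "$mirror" &&
        "${DOCKER:-docker}" tag "$mirror" "$wanted"
}

# Dry run (prints the docker arguments without contacting any registry):
#   DOCKER=echo mirror_image <mirror-image> <wanted-image>
```

  The tag step only runs if the pull succeeded, so a failed pull does not leave a misleading tag behind.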
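kubectl requires root privileges
  As mentioned earlier, several variables are set before installation, and one of them addresses this problem. Without it, kubectl reports: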
$ kubectl get all
WARN[2019-10-20T22:58:52.068331383+08:00] Unable to read /etc/rancher/k3s/k3s.yaml, please start server with --write-kubeconfig-mode to modify kube config permissions
error: Error loading config file "/etc/rancher/k3s/k3s.yaml": open /etc/rancher/k3s/k3s.yaml: permission denied
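  The default permissions of /etc/rancher/k3s/k3s.yaml are -rw-------, i.e. 600, with owner root root
  As the message says, the server has to be started with --write-kubeconfig-mode <new mode>. In my tests, 666 lets kubectl run without root privileges
  In addition, the documentation for v1.17.0+k3s.1 mentions this option: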
--rootless                                (experimental) Run rootless
  My attempt did not succeed, though: the k3s service failed to start
  The environment variables were customized as follows:
export INSTALL_K3S_SKIP_DOWNLOAD=true
export INSTALL_K3S_EXEC="--docker --write-kubeconfig ~/.kube/config --write-kubeconfig-mode 666"
  An excerpt of the startup failure log:
$ ./install-k3s.sh
(omitted)
[INFO]  systemd: Starting k3s
Job for k3s.service failed because the control process exited with error code.
See "systemctl status k3s.service" and "journalctl -xe" for details.
$ journalctl -xe
(omitted)
Jan 09 15:31:40 dk-mi13 k3s[4490]: time="2020-01-09T15:31:40.488024565+08:00" level=fatal msg="resolving : determining current user: $HOME is not defined"
Jan 09 15:31:40 dk-mi13 systemd[1]: k3s.service: Main process exited, code=exited, status=1/FAILURE
-- Subject: Unit process exited
-- Defined-By: systemd
-- Support: www.ubuntu/support
--
-- An ExecStart= process belonging to unit k3s.service has exited.
--
-- The process' exit code is 'exited' and its exit status is 1.
Jan 09 15:31:40 dk-mi13 systemd[1]: k3s.service: Failed with result 'exit-code'.
-- Subject: Unit failed
-- Defined-By: systemd
-- Support: www.ubuntu/support
--
-- The unit k3s.service has entered the 'failed' state with result 'exit-code'.
Jan 09 15:31:40 dk-mi13 systemd[1]: Failed to start Lightweight Kubernetes.
-- Subject: A start job for unit k3s.service has failed
-- Defined-By: systemd
-- Support: www.ubuntu/support
--
-- A start job for unit k3s.service has finished with a failure.
--
-- The job identifier is 16669 and the job result is failed.
//TODO
  This problem is not yet solved. The part of the error that looks most relevant is
level=fatal msg="resolving : determining current user: $HOME is not defined"
  This is puzzling too: under both my own account and root (via su), $HOME has a value
$ echo $HOME
/home/dk
$ env | grep HOME
HOME=/home/dk
(omitted)
$ su
Password:
# echo $HOME
/root
# env | grep HOME
HOME=/root
(omitted)
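  Two details may be relevant here, offered as an unverified guess rather than a diagnosis. First, checking $HOME in an interactive shell says nothing about the environment of the systemd unit, which does not inherit it. Second, the shell does not expand ~ inside double quotes, so INSTALL_K3S_EXEC hands a literal ~/.kube/config to k3s, which then has to resolve it inside the service. The sketch below only demonstrates the quoting behaviour and an alternative that expands the path up front; whether that avoids the failure was not tested in this post.

```shell
# Inside double quotes the tilde stays literal.
quoted="--write-kubeconfig ~/.kube/config"

# $HOME is expanded by the shell before the value is exported,
# so the flag carries an absolute path instead.
expanded="--write-kubeconfig $HOME/.kube/config"

printf '%s\n' "$quoted" "$expanded"
```

  The first printed line still contains ~, while the second shows the absolute path of the invoking user's home directory.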
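KUBECONFIG location
  The default location of the config file causes some inconvenience elsewhere. helm, for example, needs the following extra argument to point at it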
--kubeconfig /etc/rancher/k3s/k3s.yaml
  I wanted to switch to the more conventional ~/.kube/config, using the flag --write-kubeconfig ~/.kube/config
  In addition, in v1.17.0+k3s.1, kubectl -v 6 shows how the config file is handled:
$ kubectl get all -v 6
I0109 11:27:11.815808  :375] Config loaded from file:  /etc/rancher/k3s/k3s.yaml
  It still reads /etc/rancher/k3s/k3s.yaml, but that file is in fact a symlink to ~/.kube/config:
$ ll /etc/rancher/k3s/k3s.yaml
lrwxrwxrwx 1 root root 21 Jan  9 15:48 /etc/rancher/k3s/k3s.yaml -> /home/dk/.kube/config
$ ll ~/.kube/config
-rw-rw-rw- 1 root root 1052 Jan  9 15:48 /home/dk/.kube/config
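  The same check can be done in one step by resolving the symlink. This is a small sketch: effective_kubeconfig is a hypothetical name, the default argument is the k3s path named in this post, and readlink -f is assumed to be available (it is part of GNU coreutils).

```shell
# effective_kubeconfig is a hypothetical helper: print the real file behind
# a kubeconfig path, following symlinks. Defaults to the k3s location.
effective_kubeconfig() {
    readlink -f "${1:-/etc/rancher/k3s/k3s.yaml}"
}

# Example (not run here):
#   effective_kubeconfig    # on the machine in this post, the listing above
#                           # suggests it would print /home/dk/.kube/config
```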
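kubectl get all is slow
  With v1.17.0+k3s.1, kubectl get all takes a long time (still the case in v1.18.6+k3s1), whereas commands that list a single resource type, such as kubectl get pod, are not noticeably slow. Adding -v 6 gives a more detailed log: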
$ kubectl get all -v 6
(omitted)
I0109 11:27:11.824426  20876 :443] GET 127.0.0.1:6443/api?timeout=32s 200 OK in 8 milliseconds
I0109 11:27:11.824977  20876 :443] GET 127.0.0.1:6443/apis?timeout=32s 200 OK in 0 milliseconds
I0109 11:27:11.825346  20876 :130] failed to write cache to /home/dk/.kube/cache/discovery/127.0.0.1_6443/servergroups.json due to mkdir /home/dk/.kube/cache: permission denied
I0109 11:27:11.828528  20876 :443] GET 127.0.0.1:6443/api/v1?timeout=32s 200 OK in 2 milliseconds
I0109 11:27:11.829574  20876 :87] failed to write cache to /home/dk/.kube/cache/discovery/127.0.0.1_6443/v1/serverresources.json due to mkdir /home/dk/.kube/cache: permission denied
(omitted)
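  So the cause is a lack of permission to write under ~/.kube/cache, and handling the many resulting errors takes time. The directory does not exist by default, and the parent .kube directory is owned by root root with mode 755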
$ ll ~ | grep .kube
drwxr-xr-x  2 root root  4096 Jan  9 11:30  .kube/
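  Running sudo kubectl get all does not show this delay.
  To fix it, either make the .kube directory writable by other users, or create the two directories cache and http-cache and change their owner to the current user. An example of the latter, run from inside ~/.kube: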
$ sudo mkdir cache http-cache
$ sudo chown dk:dk cache http-cache
  That resolved the slowness of kubectl get all and similar commands
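  The second fix can also be written with absolute paths so it does not depend on the current directory. A minimal sketch, with assumptions flagged: fix_kube_cache and the SUDO override are names made up for this example, and the directory and owner are supplied by the caller.

```shell
# fix_kube_cache is a hypothetical helper: create the two cache directories
# under a .kube directory and hand them to the given owner.
# SUDO defaults to "sudo"; set SUDO= (empty) when no escalation is needed.
fix_kube_cache() {
    kube_dir="$1"    # e.g. /home/dk/.kube
    owner="$2"       # e.g. dk:dk
    ${SUDO-sudo} mkdir -p "$kube_dir/cache" "$kube_dir/http-cache" &&
        ${SUDO-sudo} chown "$owner" "$kube_dir/cache" "$kube_dir/http-cache"
}

# Example (not run here), matching the user and paths in this post:
#   fix_kube_cache /home/dk/.kube dk:dk
```

  Using mkdir -p makes the function safe to re-run if the directories already exist.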
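Author: dracula337435
Link: www.jianshu/p/dbc8d9a8374e
Source: Jianshu
Copyright belongs to the author. For commercial reuse, contact the author for authorization; for non-commercial reuse, credit the source.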
