Full-stack Kubernetes monitoring with metrics-server and Prometheus
I. Overview
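Use metrics-server to collect data for consumers inside the cluster, such as kubectl, HPA, and the scheduler
Use prometheus-operator to deploy Prometheus and store the monitoring data
Use kube-state-metrics to collect data about the resource objects in the cluster
Use node_exporter to collect data from each node in the cluster
Use Prometheus to collect data from the apiserver, scheduler, controller-manager, and kubelet components
Use Alertmanager for monitoring alerts
Use Grafana for data visualization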
1. Deploy metrics-server
git clone github/cuishuaigit/k8s-monitor.git
cd k8s-monitor
I deploy this kind of service on the master node, so metrics-server-deployment.yaml has to be modified:
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-server
  namespace: kube-system
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    k8s-app: metrics-server
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  template:
    metadata:
      name: metrics-server
      labels:
        k8s-app: metrics-server
    spec:
      serviceAccountName: metrics-server
      tolerations:
      - effect: NoSchedule
        key: node.kubernetes.io/unschedulable
        operator: Exists
      - key: NoSchedule
        operator: Exists
        effect: NoSchedule
      volumes:
      # mount in tmp so we can safely use from-scratch images and/or read-only containers
      - name: tmp-dir
        emptyDir: {}
      containers:
      - name: metrics-server
        image: io/metrics-server-amd64:v0.3.1
        imagePullPolicy: Always
        command:
        - /metrics-server
        - --kubelet-insecure-tls
        - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
        volumeMounts:
        - name: tmp-dir
          mountPath: /tmp
      nodeSelector:
        metrics: "yes"
Add the label to the master node:
kubectl label nodes ku metrics=yes
Deploy:
kubectl create -f metrics-server/deploy/1.8+/
Verify:
it's cool
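The verification step can be sketched as a small shell function. This is a minimal sketch, assuming the standard Metrics API registration name v1beta1.metrics.k8s.io and the k8s-app=metrics-server label from the manifest above; verify_metrics_server is a hypothetical helper name, not something shipped in the repository.

```shell
# Sketch: check that the Metrics API is registered and returning data.
# verify_metrics_server is a hypothetical helper name.
verify_metrics_server() {
  # The aggregated API that metrics-server registers with the apiserver.
  kubectl get apiservice v1beta1.metrics.k8s.io || return 1
  # The pod should be Running on the node labeled metrics=yes.
  kubectl -n kube-system get pods -l k8s-app=metrics-server -o wide || return 1
  # Once the first scrape has completed, these report CPU and memory usage.
  kubectl top nodes || return 1
  kubectl top pods -n kube-system
}
```

Calling verify_metrics_server on a machine with cluster access runs the four checks in order and stops at the first one that fails. Right after deployment, kubectl top may fail for a short while until metrics have been collected.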
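Note: by default metrics-server contacts each node by its hostname, but CoreDNS has no records for the hostnames of the physical machines. One fix is to add this argument at deploy time so that the node's internal IP is tried first: - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP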
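2. Deploy Prometheus
Download the required files:
Deploying metrics-server above already pulled all the files to the local machine, so use them directly:
cd k8s-monitor
1. Set up an NFS service to dynamically provision persistent storage
1. Install NFS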
sudo apt-get install -y nfs-kernel-server
sudo apt-get install -y nfs-common
sudo vi /etc/exports
/data/opv *(rw,sync,no_root_squash,no_subtree_check)
Replace * with your own IP range. On a purely internal network you can also keep *, which matches any client.
sudo /etc/init.d/rpcbind restart
sudo /etc/init.d/nfs-kernel-server restart
sudo systemctl enable rpcbind nfs-kernel-server
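To illustrate the note about replacing * in /etc/exports, here is a minimal sketch that builds the export entry for a given client range. The subnet 192.168.1.0/24 is only an example value and exports_entry is a hypothetical helper name; the export options are the ones used above.

```shell
# Build an /etc/exports entry for a directory and a client range.
# exports_entry is a hypothetical helper; 192.168.1.0/24 is an example subnet.
exports_entry() {
  dir=$1
  clients=$2
  printf '%s %s(rw,sync,no_root_squash,no_subtree_check)\n' "$dir" "$clients"
}

# Prints the line to place in /etc/exports instead of the wildcard entry.
exports_entry /data/opv 192.168.1.0/24
```

After editing /etc/exports, running sudo exportfs -ra re-reads the file without restarting the NFS server.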
Mount the export on the clients:
sudo apt-get install -y nfs-common
mount -t nfs ku13-1:/data/opv /data/opv -o proto=tcp -o nolock
For convenience, put the mount command above directly into .bashrc.
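Because .bashrc runs for every new shell, the mount command would be attempted again each time. A possible guard, sketched here with the hypothetical helper name mount_opv, uses mountpoint from util-linux to skip the mount when /data/opv is already mounted. The host ku13-1 and the path come from the command above; mount itself still needs root privileges.

```shell
# Mount the NFS export only when it is not mounted yet.
# mount_opv is a hypothetical helper name.
mount_opv() {
  # mountpoint -q exits 0 when the directory is already a mount point.
  if mountpoint -q /data/opv; then
    return 0
  fi
  mount -t nfs ku13-1:/data/opv /data/opv -o proto=tcp -o nolock
}
```

With this function defined in .bashrc, a single mount_opv call on the following line replaces the bare mount command.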
2. Create the namespace
kubectl create -f nfs/monitoring-namespace.yaml
3. Create the RBAC rules for NFS
kubectl create -f nfs/rbac.yaml
4. Create the deployment, replacing the NFS address with your own
kubectl create -f nfs/nfs-deployment.yaml
5. Create the StorageClass
kubectl create -f nfs/storageClass.yaml
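One way to check that dynamic provisioning works is to request a tiny volume from the new StorageClass. The sketch below only prints a PersistentVolumeClaim manifest. The claim name nfs-test-claim and the helper name test_pvc_manifest are hypothetical, and the StorageClass name must be taken from nfs/storageClass.yaml, since it is not stated here.

```shell
# Print a small PVC manifest that requests storage from a given StorageClass.
# test_pvc_manifest and the claim name nfs-test-claim are hypothetical.
test_pvc_manifest() {
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-test-claim
  namespace: monitoring
spec:
  storageClassName: $1
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
EOF
}

# Usage (requires a cluster):
#   test_pvc_manifest <storageclass-name> | kubectl create -f -
#   kubectl -n monitoring get pvc nfs-test-claim
```

If the provisioner is healthy, the claim should move to the Bound state and a matching directory should appear under the NFS export.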
2. Install Prometheus
cd k8s-monitor/Promutheus/prometheus
1. Create the permissions
kubectl create -f rbac.yaml
2. Create node-exporter
kubectl create -f prometheus-node-exporter-daemonset.yaml
kubectl create -f prometheus-node-exporter-service.yaml
3. Create kube-state-metrics
kubectl create -f kube-state-metrics-deployment.yaml
kubectl create -f kube-state-metrics-service.yaml
4. Create node-directory-size-metrics
kubectl create -f node-directory-size-metrics-daemonset.yaml
5. Create Prometheus
kubectl create -f prometheus-pvc.yaml
kubectl create -f prometheus-core-configmap.yaml
kubectl create -f prometheus-core-deployment.yaml
kubectl create -f prometheus-core-service.yaml
kubectl create -f prometheus-rules-configmap.yaml
6. Change the etcd address in the core ConfigMap to your own
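The creation steps above can be sketched as a single loop that applies the manifests in the same order and stops at the first failure. create_prometheus_stack is a hypothetical helper name, and the file names are the ones listed in the steps. Editing the etcd address in prometheus-core-configmap.yaml before running it avoids having to recreate the ConfigMap afterwards.

```shell
# Create the Prometheus manifests in the documented order.
# create_prometheus_stack is a hypothetical helper name.
create_prometheus_stack() {
  for f in \
    rbac.yaml \
    prometheus-node-exporter-daemonset.yaml \
    prometheus-node-exporter-service.yaml \
    kube-state-metrics-deployment.yaml \
    kube-state-metrics-service.yaml \
    node-directory-size-metrics-daemonset.yaml \
    prometheus-pvc.yaml \
    prometheus-core-configmap.yaml \
    prometheus-core-deployment.yaml \
    prometheus-core-service.yaml \
    prometheus-rules-configmap.yaml
  do
    echo "creating $f"
    # Stop immediately so a failed step is not hidden by later output.
    kubectl create -f "$f" || { echo "failed on $f" >&2; return 1; }
  done
}
```

Run it from the k8s-monitor/Promutheus/prometheus directory so that the relative file names resolve.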
3. Install Grafana
cd k8s-monitor/Promutheus/grafana
1. Create the Grafana service
kubectl create -f grafana-svc.yaml
2. Create the ConfigMap
kubectl create -f grafana-configmap.yaml
3. Create the PVC
kubectl create -f grafana-pvc.yaml
4. Create the Grafana deployment
kubectl create -f grafana-deployment.yaml
5. Create the dashboard ConfigMap
kubectl create configmap "grafana-import-dashboards" --from-file=dashboards/ --namespace=monitoring
6. Create the job that imports the dashboards and other data
kubectl create -f grafana-job.yaml
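Check the deployment:
Both Prometheus and Grafana are exposed as NodePort services, so they can be accessed directly at any node's IP and the assigned port.
Grafana's default username and password: admin/admin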
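To find which port to open in the browser, the NodePort can be read from the Service object. This is a minimal sketch: service_node_port is a hypothetical helper name, and the Service name has to match the one defined in the manifests, which is not stated here.

```shell
# Print the first nodePort of a Service in the monitoring namespace.
# service_node_port is a hypothetical helper; pass the Service name as $1.
service_node_port() {
  kubectl -n monitoring get svc "$1" -o jsonpath='{.spec.ports[0].nodePort}'
}

# Usage (requires a cluster):
#   kubectl -n monitoring get svc          # list Services and their port mappings
#   service_node_port <service-name>       # then browse to http://<node-ip>:<port>
```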
QA:
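1. The cluster was deployed with kubeadm, and controller-manager and scheduler both listen on 127.0.0.1, so Prometheus cannot collect their data? You can change the listen address before initializing the cluster:
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
controllerManager:
  extraArgs:
    address: 0.0.0.0
scheduler:
  extraArgs:
    address: 0.0.0.0
If the cluster has already been built: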
sed -e "s/- --address=127.0.0.1/- --address=0.0.0.0/" -i /etc/kubernetes/manifests/kube-controller-manager.yaml
sed -e "s/- --address=127.0.0.1/- --address=0.0.0.0/" -i /etc/kubernetes/manifests/kube-scheduler.yaml
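Before editing the live static pod manifests, the substitution can be previewed without writing anything. The sketch below applies the same sed expression but omits -i, so the result only goes to standard output; preview_bind_address_change is a hypothetical helper name.

```shell
# Show what the manifest would look like after the change, without modifying it.
# preview_bind_address_change is a hypothetical helper name.
preview_bind_address_change() {
  sed -e "s/- --address=127.0.0.1/- --address=0.0.0.0/" "$1"
}

# Usage:
#   preview_bind_address_change /etc/kubernetes/manifests/kube-scheduler.yaml | grep -- '--address'
```

Once the files under /etc/kubernetes/manifests are changed, the kubelet notices the modified static pod manifests and recreates the two pods on its own.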
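2. metrics-server does not work and reports that it cannot resolve the nodes' hostnames?
Modify the deployment file and add this argument:
- --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
3. metrics-server reports an x509 error saying the certificate is not trusted?
Add --kubelet-insecure-tls to the command:
command:
- /metrics-server
- --kubelet-insecure-tls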
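The two metrics-server problems above can be told apart from the pod's logs. The following is a rough sketch, assuming the hostname failure shows up as a "no such host" DNS error and the certificate failure contains "x509"; the exact log wording may differ between versions. suggest_metrics_server_fix is a hypothetical helper name.

```shell
# Read metrics-server log text on stdin and print the matching flag to add.
# The matched substrings are assumptions about the log wording.
suggest_metrics_server_fix() {
  logs=$(cat)
  case "$logs" in
    *"no such host"*)
      echo "add --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP" ;;
  esac
  case "$logs" in
    *x509*)
      echo "add --kubelet-insecure-tls" ;;
  esac
}

# Usage (requires a cluster):
#   kubectl -n kube-system logs deployment/metrics-server | suggest_metrics_server_fix
```

Note that --kubelet-insecure-tls makes metrics-server skip verification of the kubelets' serving certificates, so it trades security for convenience.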
4. The complete container configuration
      containers:
      - name: metrics-server
        image: io/metrics-server-amd64:v0.3.1
        command:
        - /metrics-server
        - --metric-resolution=30s
        - --kubelet-insecure-tls
        - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
