在线时间:8:00-16:00
迪恩网络APP
随时随地掌握行业动态
扫描二维码
关注迪恩网络微信公众号
开源软件名称:kubeeye开源软件地址:https://gitee.com/kubesphere/kubeeye开源软件介绍:
KubeEyeKubeEye 旨在发现 Kubernetes 上的各种问题,比如应用配置错误(使用Polaris)、集群组件不健康和节点问题(使用Node-Problem-Detector)。除了预定义的规则,它还支持自定义规则。 架构图KubeEye 通过调用Kubernetes API,通过常规匹配日志中的关键错误信息和容器语法的规则匹配来获取集群诊断数据,详见架构。 怎么使用
root@node1:# ke diagNODENAME SEVERITY HEARTBEATTIME REASON MESSAGEnode18 Fatal 2020-11-19T10:32:03+08:00 NodeStatusUnknown Kubelet stopped posting node status.node19 Fatal 2020-11-19T10:31:37+08:00 NodeStatusUnknown Kubelet stopped posting node status.node2 Fatal 2020-11-19T10:31:14+08:00 NodeStatusUnknown Kubelet stopped posting node status.node3 Fatal 2020-11-27T17:36:53+08:00 KubeletNotReady Container runtime not ready: RuntimeReady=false reason:DockerDaemonNotReady message:docker: failed to get docker version: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?NAME SEVERITY TIME MESSAGEscheduler Fatal 2020-11-27T17:09:59+08:00 Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refusedetcd-0 Fatal 2020-11-27T17:56:37+08:00 Get https://192.168.13.8:2379/health: dial tcp 192.168.13.8:2379: connect: connection refusedNAMESPACE SEVERITY PODNAME EVENTTIME REASON MESSAGEdefault Warning node3.164b53d23ea79fc7 2020-11-27T17:37:34+08:00 ContainerGCFailed rpc error: code = Unknown desc = Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?default Warning node3.164b553ca5740aae 2020-11-27T18:03:31+08:00 FreeDiskSpaceFailed failed to garbage collect required amount of images. Wanted to free 5399374233 bytes, but freed 416077545 bytesdefault Warning nginx-b8ffcf679-q4n9v.16491643e6b68cd7 2020-11-27T17:09:24+08:00 Failed Error: ImagePullBackOffdefault Warning node3.164b5861e041a60e 2020-11-27T19:01:09+08:00 SystemOOM System OOM encountered, victim process: stress, pid: 16713default Warning node3.164b58660f8d4590 2020-11-27T19:01:27+08:00 OOMKilling Out of memory: Kill process 16711 (stress) score 205 or sacrifice child Killed process 16711 (stress), UID 0, total-vm:826516kB, anon-rss:819296kB, file-rss:0kB, shmem-rss:0kBinsights-agent Warning workloads-1606467120.164b519ca8c67416 2020-11-27T16:57:05+08:00 DeadlineExceeded Job was active longer than specified deadlinekube-system Warning calico-node-zvl9t.164b3dc50580845d 2020-11-27T17:09:35+08:00 DNSConfigForming Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 100.64.11.3 114.114.114.114 119.29.29.29kube-system Warning kube-proxy-4bnn7.164b3dc4f4c4125d 2020-11-27T17:09:09+08:00 DNSConfigForming Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 100.64.11.3 114.114.114.114 119.29.29.29kube-system Warning nodelocaldns-2zbhh.164b3dc4f42d358b 2020-11-27T17:09:14+08:00 DNSConfigForming Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 100.64.11.3 114.114.114.114 119.29.29.29NAMESPACE SEVERITY NAME KIND TIME MESSAGEkube-system Warning node-problem-detector DaemonSet 2020-11-27T17:09:59+08:00 [livenessProbeMissing runAsPrivileged]kube-system Warning calico-node DaemonSet 2020-11-27T17:09:59+08:00 [runAsPrivileged cpuLimitsMissing]kube-system Warning nodelocaldns DaemonSet 2020-11-27T17:09:59+08:00 [cpuLimitsMissing runAsPrivileged]default Warning nginx Deployment 2020-11-27T17:09:59+08:00 [cpuLimitsMissing livenessProbeMissing tagNotSpecified]insights-agent Warning workloads CronJob 2020-11-27T17:09:59+08:00 [livenessProbeMissing]insights-agent Warning cronjob-executor Job 2020-11-27T17:09:59+08:00 [livenessProbeMissing]kube-system Warning calico-kube-controllers Deployment 2020-11-27T17:09:59+08:00 [cpuLimitsMissing livenessProbeMissing]kube-system Warning coredns Deployment 2020-11-27T17:09:59+08:00 [cpuLimitsMissing] 您可以参考常见FAQ内容来优化您的集群。 KubeEye 能做什么
检查项
未标注的项目正在开发中 添加自定义检查规则添加 npd 自定义检查规则
kubectl edit cm -n kube-system node-problem-detector-config
自定义最佳实践规则
checks: imageFromUnauthorizedRegistry: warningcustomChecks: imageFromUnauthorizedRegistry: promptMessage: When the corresponding rule does not match. Show that image from an unauthorized registry. category: Images target: Container schema: '$schema': http://json-schema.org/draft-07/schema type: object properties: image: type: string not: pattern: ^quay.io
root:# ke diag -f rule.yaml --kubeconfig ~/.kube/configNAMESPACE SEVERITY NAME KIND TIME MESSAGEdefault Warning nginx Deployment 2020-11-27T17:18:31+08:00 [imageFromUnauthorizedRegistry]kube-system Warning node-problem-detector DaemonSet 2020-11-27T17:18:31+08:00 [livenessProbeMissing runAsPrivileged]kube-system Warning calico-node DaemonSet 2020-11-27T17:18:31+08:00 [cpuLimitsMissing runAsPrivileged]kube-system Warning calico-kube-controllers Deployment 2020-11-27T17:18:31+08:00 [cpuLimitsMissing livenessProbeMissing]kube-system Warning nodelocaldns DaemonSet 2020-11-27T17:18:31+08:00 [runAsPrivileged cpuLimitsMissing]default Warning nginx Deployment 2020-11-27T17:18:31+08:00 [livenessProbeMissing cpuLimitsMissing]kube-system Warning coredns Deployment 2020-11-27T17:18:31+08:00 [cpuLimitsMissing] 文档 |
请发表评论