We have set up a full Prometheus stack - Prometheus, Grafana, Alertmanager, Node Exporter and Blackbox Exporter - using the community Helm charts in our Kubernetes cluster. The monitoring stack is deployed in its own namespace, and our main software, composed of microservices, is deployed in the default namespace. Alerting works fine, but the Blackbox Exporter does not seem to scrape metrics correctly (I guess) and regularly fires false positive alerts. We use it to probe the HTTP liveness/readiness endpoints of our microservices.
My configuration (in values.yaml) related to the issue looks like:
- alert: InstanceDown
  expr: up == 0
  for: 5m
  annotations:
    title: 'Instance {{ $labels.instance }} down'
    description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.'
- alert: ExporterIsDown
  expr: up{job="prometheus-blackbox-exporter"} == 0
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Blackbox exporter is down"
    description: "Blackbox exporter is down or not being scraped correctly"
...
...
...
extraScrapeConfigs: |
  - job_name: 'prometheus-blackbox-exporter'
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets:
          - http://service1.default.svc.cluster.local:8082/actuator/health/liveness
          - http://service2.default.svc.cluster.local:8081/actuator/health/liveness
          - http://service3.default.svc.cluster.local:8080/actuator/health/liveness
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: prometheus-blackbox-exporter:9115
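To sanity-check that the exporter can actually reach a target, its /probe endpoint can be called by hand from inside the cluster. This is only a sketch: it assumes the exporter service is named prometheus-blackbox-exporter on port 9115 (as in the relabel rule above) and that the monitoring namespace is literally called monitoring; adjust both to your setup:

kubectl run probe-test --rm -it --restart=Never --image=curlimages/curl --command -- \
  curl -s 'http://prometheus-blackbox-exporter.monitoring.svc.cluster.local:9115/probe?module=http_2xx&target=http://service1.default.svc.cluster.local:8082/actuator/health/liveness'

If the probe works, the output contains probe_success 1 along with the other probe_* metrics.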
These two alerts fire every hour, yet at those times the endpoints are 100% reachable.
We're using the default prometheus-blackbox-exporter/values.yaml file:
config:
  modules:
    http_2xx:
      prober: http
      timeout: 5s
      http:
        valid_http_versions: ["HTTP/1.1", "HTTP/2.0"]
        no_follow_redirects: false
        preferred_ip_protocol: "ip4"
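For reference, with this http_2xx module the per-target probe result is exposed as probe_success, while up for the prometheus-blackbox-exporter job only reflects whether Prometheus could scrape the exporter itself. Probe-level rules in blackbox setups are therefore usually written against probe_success; a minimal sketch (not part of our current config, names are illustrative):

- alert: EndpointProbeFailed
  expr: probe_success{job="prometheus-blackbox-exporter"} == 0
  for: 5m
  labels:
    severity: critical
  annotations:
    description: 'Probe of {{ $labels.instance }} has been failing for more than 5 minutes.'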
The alert emails look like this:
[5] Firing
Labels
alertname = InstanceDown
instance = http://service1.default.svc.cluster.local:8082/actuator/health/liveness
job = prometheus-blackbox-exporter
severity = critical
The other type of email:
Labels
alertname = ExporterIsDown
instance = http://service1.default.svc.cluster.local:8082/actuator/health/liveness
job = prometheus-blackbox-exporter
severity = warning
Annotations
description = Blackbox exporter is down or not being scraped correctly
summary = Blackbox exporter is down
Another odd thing I noticed is that in the Prometheus UI I don't see any of the probe_* metrics shown here: https://lapee79.github.io/en/article/monitoring-http-using-blackbox-exporter/
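For reference, given the job name in the scrape config above, I would expect queries like these in the Prometheus UI to return one series per probed target:

probe_success{job="prometheus-blackbox-exporter"}
probe_http_status_code{job="prometheus-blackbox-exporter"}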
I'm not sure what we're doing wrong or what we're missing, but it's very annoying to get hundreds of false positive emails.