Wait For
Kubernetes clusters and similar dynamic systems may experience temporary discrepancies between the actual and intended state of resources.
For example, a deployment could momentarily appear unhealthy during a scaling operation.
If alerts are configured for config.unhealthy
events, these transient state fluctuations might lead to an overwhelming number of unnecessary notifications.
To address this issue, you can utilize the waitFor parameter. This feature allows you to define a delay before sending notifications for specific events. After an event occurs, the system rechecks its status following the specified wait period. Only if the undesired state persists does a notification trigger.
waitFor
is only applicable to notifications of health related events
notify-unhealthy-deployments.yamlapiVersion: mission-control.flanksource.com/v1
kind: Notification
metadata:
name: deployment-unhealthy-alerts
spec:
events:
- config.unhealthy
waitFor: 2m
filter: config.type == 'Kubernetes::Deployment'
to:
email: alerts@acme.com
waitFor
re-evaluates the health based on the current state in config-db.
However, in some circumstances, there may be a delay between when a change occurs and when it's refelected in config-db,
potentially resulting in false positives.
waitForEvalPeriod
forces an incremental scrape of the resource before sending a notification.
It waits for up to this period for a scrape to complete before sending a notification.
apiVersion: mission-control.flanksource.com/v1
kind: Notification
metadata:
name: deployment-unhealthy-alerts
spec:
events:
- config.unhealthy
waitFor: 5m
waitForEvalPeriod: 30s