
Create PostgreSQL monitors
Closed, Resolved (Public)

Description

We can use the Prometheus rules shipped with the cloudnative-pg cluster chart (https://github.com/cloudnative-pg/charts/tree/cloudnative-pg-v0.21.4/charts/cluster/prometheus_rules) as a basis for alerts on the Prometheus metrics we scrape from each cluster.

Each monitor would alert per cluster name, so that a single monitor covers every cluster and we avoid having to re-create monitors every time we create a new PG cluster.
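
To illustrate the idea of alerting by cluster name, here is a minimal Python sketch of how one generic monitor could cover any number of clusters. It is not the actual alert definition from operations/alerts: the label name `cluster_name`, the sample values, and the threshold are all hypothetical and chosen only for illustration. The sketch groups scraped samples by their cluster label and reports one alert per cluster whose worst value exceeds the threshold.

```python
# Hypothetical label carrying the PG cluster name on each scraped sample.
CLUSTER_LABEL = "cluster_name"


def firing_alerts(samples, threshold):
    """Return {cluster: worst_value} for clusters whose worst sample exceeds threshold.

    `samples` is an iterable of (labels, value) pairs, where `labels` is a dict.
    """
    worst = {}
    for labels, value in samples:
        cluster = labels[CLUSTER_LABEL]
        # Keep the highest value seen so far for this cluster.
        worst[cluster] = max(worst.get(cluster, value), value)
    return {cluster: value for cluster, value in sorted(worst.items()) if value > threshold}


# Illustrative data only: two clusters, one of which is above the threshold.
samples = [
    ({"cluster_name": "pg-a", "pod": "pg-a-1"}, 2.0),
    ({"cluster_name": "pg-a", "pod": "pg-a-2"}, 450.0),
    ({"cluster_name": "pg-b", "pod": "pg-b-1"}, 1.5),
]

print(firing_alerts(samples, threshold=300.0))  # {'pg-a': 450.0}
```

Because the grouping key is the cluster label rather than a hard-coded cluster name, samples from a newly created cluster are picked up without any change to the monitor itself.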

Event Timeline

Gehel triaged this task as High priority. Aug 14 2024, 8:32 AM

Change #1067338 had a related patch set uploaded (by Brouberol; author: Brouberol):

[operations/alerts@master] cloudnative-pg: add monitors for PG clusters

https://gerrit.wikimedia.org/r/1067338

Change #1067340 had a related patch set uploaded (by Brouberol; author: Brouberol):

[operations/deployment-charts@master] cloudnative-pg: enable ingress traffic to the prometheus port

https://gerrit.wikimedia.org/r/1067340

Change #1067340 merged by Brouberol:

[operations/deployment-charts@master] cloudnative-pg: enable ingress traffic to the prometheus port

https://gerrit.wikimedia.org/r/1067340

Change #1067352 had a related patch set uploaded (by Brouberol; author: Brouberol):

[operations/deployment-charts@master] Upgrade airflow to 2.10.0

https://gerrit.wikimedia.org/r/1067352

Change #1067352 abandoned by Brouberol:

[operations/deployment-charts@master] Upgrade airflow to 2.10.0

Reason:

This change has already gone through apparently

https://gerrit.wikimedia.org/r/1067352

Change #1067338 merged by Brouberol:

[operations/alerts@master] cloudnative-pg: add monitors for PG clusters

https://gerrit.wikimedia.org/r/1067338

Change #1080251 had a related patch set uploaded (by Brouberol; author: Brouberol):

[operations/alerts@master] cloudnative_pg: scope the WAL lag alert on the primary instance

https://gerrit.wikimedia.org/r/1080251

Change #1080251 merged by Brouberol:

[operations/alerts@master] cloudnative_pg: scope the WAL lag alert on the primary instance

https://gerrit.wikimedia.org/r/1080251

Change #1080293 had a related patch set uploaded (by Brouberol; author: Brouberol):

[operations/alerts@master] cloudnative_pg: add a cookbook to investigate wal archive issues

https://gerrit.wikimedia.org/r/1080293

Change #1080293 merged by Brouberol:

[operations/alerts@master] cloudnative_pg: add a cookbook to investigate wal archive issues

https://gerrit.wikimedia.org/r/1080293