action #160239
Updated by tinita 7 months ago
1 firing alert instance
SALT › EXTERNAL HTTP RESPONSES
1 firing instances

Firing [stats.openqa-monitor.qa.suse.de]
http://stats.openqa-monitor.qa.suse.de/alerting/grafana/b3a53df8-b7ee-48dd-9325-8a541187737f/view?orgId=1
External http responses
Summary: HTTP endpoint does not properly work
Description: An HTTP endpoint we need for proper operation delivers an http status code which indicates an issue with the service or its reachability.
Values: B=500 C=1
Labels:
alertname: External http responses
grafana_folder: Salt
server: https://openqa.suse.de/health

Looking into the access log, we have had 4825 "500 Server Error" responses so far today, and not only for https://openqa.suse.de/health.

The error log shows many entries like this:
```
2024/05/12 00:06:06 [crit] 2563#2563: accept4() failed (24: Too many open files)
```
The first occurrence I can find was at 2024/05/07 12:02:50.

For comparison, the number of open files on o3 and osd:
```
# o3
lsof | wc -l
18978
# osd
lsof | wc -l
35675
```
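
To make the "first occurrence" check repeatable, here is a minimal sketch that scans an error log for `Too many open files` entries and reports how many there are and when the first and last ones occurred. It assumes the line shape seen in the excerpt above (`YYYY/MM/DD HH:MM:SS [level] pid#tid: message`). The function name `summarize_fd_errors` and the log path in the usage comment are hypothetical and not taken from the actual setup.

```python
import re
from datetime import datetime

# Line shape assumed from the excerpt in this ticket, e.g.:
# 2024/05/12 00:06:06 [crit] 2563#2563: accept4() failed (24: Too many open files)
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<level>\w+)\] \d+#\d+: (?P<msg>.*)$"
)
NEEDLE = "Too many open files"


def summarize_fd_errors(lines):
    """Return (count, first_seen, last_seen) for fd-exhaustion log lines.

    first_seen and last_seen are datetime objects, or None if nothing matched.
    Lines that do not match the assumed format are skipped silently.
    """
    count = 0
    first = None
    last = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None or NEEDLE not in match.group("msg"):
            continue
        ts = datetime.strptime(match.group("ts"), "%Y/%m/%d %H:%M:%S")
        count += 1
        if first is None or ts < first:
            first = ts
        if last is None or ts > last:
            last = ts
    return count, first, last


# Usage (hypothetical path, adjust to the real error log location):
#   with open("/var/log/nginx/error.log", errors="replace") as fh:
#       print(summarize_fd_errors(fh))
```

Rotated or compressed logs would have to be fed in separately, so the result is only as complete as the files passed to the function.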