tickets #181646
opencode.o.o (Pagure) causes DoS against id.o.o (Ipsilon)
Description
This is just from the last 22.5 hours:
ldap-proxy (idp proxy server):~ # grep -c 2a07:de40:b27e:1206::a /var/log/apache2/error_log
1098630
This causes the session cache on Ipsilon to fill up with session and lock files so rapidly that the disk runs full before Ipsilon cleans up.
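For a sense of scale, the count above can be turned into an average rate. This is only a back-of-the-envelope sketch in Python using the two figures from the description; the variable names are illustrative, and it assumes nothing about how many log lines a single request produces.

```python
# Figures taken from the description: grep -c result and the time window.
error_lines = 1_098_630
window_hours = 22.5

# Average rate over the whole window (the real traffic may be bursty).
per_hour = error_lines / window_hours
per_second = per_hour / 3600

print(f"{per_hour:.0f} lines/hour, {per_second:.1f} lines/second")
# prints "48828 lines/hour, 13.6 lines/second"
```

That is an average of roughly 13 to 14 error_log lines per second from a single client address.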
Updated by crameleon about 1 month ago
- Category set to Pagure
- Assignee set to Pharaoh_Atem
- Priority changed from Normal to High
- Private changed from Yes to No
Updated by crameleon about 1 month ago · Edited
I have now blocked pagure01.i.o.o from reaching id.o.o.
Please mitigate this situation and let me know when I can remove the ban again.
Updated by crameleon about 1 month ago · Edited
I marked code.o.o as down on status.o.o as it seems to serve 502 as a result.
Edit: it seems to work occasionally, just not login, so I changed it to partial outage.
Updated by crameleon about 1 month ago
This also does not help: https://pagure.io/ipsilon/issue/262. ;-)
Updated by crameleon about 1 month ago
- Related to tickets #181751: login to code.o.o not possible added
Updated by sfalken@cloverleaf-linux.org 28 days ago
I'm not a hero, and don't have the right access to look, but can somebody hit me with some logs, so I can have a look and see if I can't fix this issue?
Updated by crameleon 28 days ago
If you let me know what you're interested in, sure. On pagure01 I find it still trying to query id.o.o constantly:
May 07 02:47:05 pagure01 gunicorn[4049]: 2025-05-07 02:47:05,959 [WARNING] pagure.ui.flask_fas_openid: Error fetching XRDS document: Remote end closed connection without response
pagure01 (pagure):~ # journalctl -t gunicorn -g XRDS -S '1d ago' --no-pager |wc -l
277045
(that warning is just because it's no longer allowed to reach it)
There's nothing that seems relevant leading up to the message in the journal.
In /var/log/pagure/ I find an access_web.log, which seems to be the web server access log, and an error_web.log, which shows this over and over again:
[2025-05-07 15:56:16 +0000] [923] [CRITICAL] WORKER TIMEOUT (pid:19664)
[2025-05-07 15:56:16 +0000] [923] [CRITICAL] WORKER TIMEOUT (pid:19847)
[2025-05-07 15:56:17 +0000] [19943] [INFO] Booting worker with pid: 19943
[2025-05-07 15:56:17 +0000] [19946] [INFO] Booting worker with pid: 19946
pagure01 (pagure):~ # grep -c '^\[2025-05-07.*TIMEOUT' /var/log/pagure/error_web.log
811
Seems somewhat broken too but probably not related.
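To check whether the timeouts cluster at particular times, the error_web.log lines could be bucketed per minute. A minimal Python sketch, assuming the log line format shown above; the function name `timeouts_per_minute` and the regular expression are illustrative and not part of Pagure or gunicorn.

```python
import re
from collections import Counter

# Matches lines of the form seen in error_web.log, e.g.
# [2025-05-07 15:56:16 +0000] [923] [CRITICAL] WORKER TIMEOUT (pid:19664)
TIMEOUT_RE = re.compile(
    r"^\[(?P<minute>\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2} [+-]\d{4}\] "
    r"\[\d+\] \[CRITICAL\] WORKER TIMEOUT \(pid:\d+\)"
)


def timeouts_per_minute(lines):
    """Count WORKER TIMEOUT lines, keyed by 'YYYY-MM-DD HH:MM'."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match:
            counts[match.group("minute")] += 1
    return counts


# The four sample lines quoted in this comment.
sample = [
    "[2025-05-07 15:56:16 +0000] [923] [CRITICAL] WORKER TIMEOUT (pid:19664)",
    "[2025-05-07 15:56:16 +0000] [923] [CRITICAL] WORKER TIMEOUT (pid:19847)",
    "[2025-05-07 15:56:17 +0000] [19943] [INFO] Booting worker with pid: 19943",
    "[2025-05-07 15:56:17 +0000] [19946] [INFO] Booting worker with pid: 19946",
]
counts = timeouts_per_minute(sample)
print(dict(counts))  # prints {'2025-05-07 15:56': 2}
```

On the real host, an open file handle for /var/log/pagure/error_web.log could be passed in place of `sample`, since the function only iterates over lines.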
Updated by crameleon 22 days ago
- Related to tickets #182324: State of pagure01.i.o.o / code.o.o added
Updated by Pharaoh_Atem 20 days ago
I am looking into it, I am just not sure yet what's going on.
Updated by crameleon 6 days ago
Considering that
- there still has not been any maintainer solution in either Pagure or Ipsilon,
- a community member has shown interest in hosting a Forgejo instance and syncing with Fedora on the migration tooling in a reasonable time frame,
and that I am interested in keeping the situation usable in the meantime, I went through a myriad of hacks and eventually made Ipsilon serve a static XRDS file instead of generating one through Ipsilon.
I will have to monitor whether this keeps the load tame enough to avoid an outage, but it means that login is possible again for now.