tickets #137843

closed

Hotfixes on code.o.o

Added by cboltz about 1 year ago. Updated 7 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Pagure
Target version: -
Start date: 2023-10-12
Due date:
% Done: 0%
Estimated time:

Description

In the last hour, I added two hotfixes on code.o.o which should be replaced by something more sane ;-)

High load caused by Bytespider / bytedance bot

The Bytespider bot made about 40% of all HTTP requests. This kept 4 gunicorn processes at 99% CPU each and caused gateway errors for many users.
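For reference, a share like the 40% above can be estimated from the access log. The following is only a minimal sketch, assuming the log uses nginx's "combined" format, where the user agent is the last quoted field. The function name `user_agent_share` is made up for illustration and is not what was actually used here.

```python
import re

# In nginx's "combined" log format the user agent is the last quoted field.
UA_RE = re.compile(r'"([^"]*)"\s*$')


def user_agent_share(lines, needle):
    """Return the fraction of parsed requests whose user agent contains needle."""
    total = 0
    hits = 0
    for line in lines:
        match = UA_RE.search(line)
        if not match:
            continue  # skip lines that do not end with a quoted field
        total += 1
        # Case-insensitive substring match on the user agent string.
        if needle.lower() in match.group(1).lower():
            hits += 1
    return hits / total if total else 0.0
```

It can be fed an open log file directly, for example `user_agent_share(open(path), "bytespider")`, where `path` is wherever the vhost writes its access log.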

As a hotfix, I extended /etc/nginx/vhosts/code.opensuse.org.conf with

 if ($http_user_agent = "Mozilla/5.0 (Linux; Android 5.0) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; Bytespider; spider-feedback@bytedance.com)"){
     return 444;
 }

I'd hope that a robots.txt also does the job.
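To sanity-check a robots.txt before deploying it, the rules can be evaluated with Python's standard `urllib.robotparser`. This is only a sketch: the `ROBOTS_TXT` content below is a hypothetical example, not the file on code.o.o, and `is_allowed` is an illustrative helper name. Note that the parser matches on the bot's product token ("Bytespider"), not on the full user agent string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: Bytespider
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Of course this only tells us what a well-behaved crawler would do. A bot that ignores robots.txt still needs the nginx rule or something similar.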

Postfix DNS failures

The Pagure timeouts and errors caused quite a few mails to be sent to root, but these could not be delivered because

(Host or domain name not found. Name service error for name=relay.infra.opensuse.org type=AAAA: Host not found, try again)

so for some reason it tried to reach relay.i.o.o over IPv6, but it only has a v4 address.

Forcing Postfix to only use IPv4 resulted in

(Host or domain name not found. Name service error for name=relay.infra.opensuse.org type=A: Host not found, try again)

so there must be something wrong with the DNS config.

As a hotfix, I added relayhost=[192.168.47.4] at the end of main.cf to completely avoid the DNS lookups, and the queued mails got delivered.

Needless to say, hardcoding the IP is a bad idea, so we'll need to find out why Postfix fails to do DNS lookups.

Doing manual DNS queries with host or dig works, so this must be something specific to Postfix.
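As an additional data point next to host and dig, the system resolver can be queried per address family. The sketch below uses `socket.getaddrinfo` from the Python standard library. The helper name `resolve` is made up for illustration, and this goes through the system's name service configuration, which may differ from the lookup path Postfix uses, so it cannot confirm or rule out the Postfix problem by itself.

```python
import socket


def resolve(name, family):
    """Return the sorted unique addresses for name in the given family, or [] on failure."""
    try:
        infos = socket.getaddrinfo(name, None, family, socket.SOCK_STREAM)
    except socket.gaierror:
        # Covers "host not found" as well as temporary resolver failures.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve("relay.infra.opensuse.org", socket.AF_INET)` and the same with `socket.AF_INET6` on the affected host would show whether the A and AAAA lookups behave differently there.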

Actions #1

Updated by cboltz about 1 year ago

  • Private changed from Yes to No
Actions #2

Updated by cboltz about 1 year ago

  • Description updated (diff)
Actions #3

Updated by crameleon about 1 year ago

This didn't help with the issue, but while investigating I noticed that we used an MX record pointing to a CNAME for infra.opensuse.org, which violates RFC 2181: https://datatracker.ietf.org/doc/html/rfc2181#section-10.3.

I corrected this now:

-infra.opensuse.org 3600 IN MX 42 relay.infra.opensuse.org
+infra.opensuse.org 3600 IN MX 42 proxy.infra.opensuse.org
Actions #4

Updated by crameleon about 1 year ago

  • Category set to Git(lab|hub)
  • Assignee set to Pharaoh_Atem
Actions #5

Updated by crameleon about 1 year ago

Hi @Pharaoh_Atem,

please update this ticket.

Actions #6

Updated by crameleon about 1 year ago

  • Category changed from Git(lab|hub) to Pagure
Actions #7

Updated by Pharaoh_Atem about 1 year ago

I'm not sure what you want me to say here, I don't know much about those issues.

Actions #8

Updated by Pharaoh_Atem about 1 year ago

Regarding the load caused by the bytedance bot: do you think we should be doing something different to solve it?

Actions #9

Updated by crameleon 7 months ago

  • Status changed from New to Resolved

This has not been an issue recently. A robots.txt was added to help with crawlers which respect it.
