tickets #95756

factory mailing list archives incomplete

Added by boombatower 10 months ago. Updated 6 months ago.

Status: New
Priority: Normal
Assignee: -
Category: Mailing lists
Target version: -
Start date: 2021-07-21
Due date:
% Done: 0%
Estimated time:

Related issues

Related to openSUSE admin - tickets #103911: Mailing list archives broken? (Resolved, 2021-12-13)

History

#1 Updated by pjessen 10 months ago

  • Category set to Mailing lists
  • Private changed from Yes to No

#2 Updated by pjessen 10 months ago

  • Subject changed from mailing list archives incomplete to factory mailing list archives incomplete

I can confirm that the factory list mbox archive for July 2021 stops at around 14 July.
The exports for other lists, e.g. https://lists.opensuse.org/archives/list/users@lists.opensuse.org/export/users@lists.opensuse.org-2021-08.mbox.gz?start=2021-07-01&end=2021-08-01 or https://lists.opensuse.org/archives/list/heroes@lists.opensuse.org/export/heroes@lists.opensuse.org-2021-08.mbox.gz?start=2021-07-01&end=2021-08-01, both seem to be complete. Maybe this only affects the factory list? I expect those mbox.gz exports are created on demand, so that is probably the place to start looking.

#3 Updated by blackbrook 6 months ago

I would like to emphasize that what is going on is non-deterministic, which should be a big clue as to the cause. If you run that curl command multiple times and look at the size of the resulting .gz file (in bytes), you will see that it is different each time. Some ranges may produce a file that looks complete (and may be usable), but there is still usually a slight variation in the size in bytes, which suggests to me that different amounts of whitespace are just being harmlessly truncated in the cases that appear successful.

Also, it is not the gz that is being truncated but the .mbox file going into the archive.
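
A minimal sketch of how the two cases could be told apart, assuming the export is a single gzip stream wrapping a plain mbox file. The helper name `inspect_export` is hypothetical, and counting lines that start with `From ` is only an approximation of the mbox message separator. If the gzip stream is intact but `mbox_bytes` or `messages` changes between downloads, the truncation happened before compression.

```python
import gzip
import io
import zlib


def inspect_export(payload: bytes) -> dict:
    """Summarise one downloaded .mbox.gz export (hypothetical helper).

    Reports the compressed size, whether the gzip stream ends cleanly,
    the decompressed size, and an approximate message count.
    """
    report = {
        "compressed_bytes": len(payload),
        "gzip_complete": True,
        "mbox_bytes": 0,
        "messages": 0,
    }
    try:
        with gzip.GzipFile(fileobj=io.BytesIO(payload)) as fh:
            for line in fh:
                report["mbox_bytes"] += len(line)
                # Assumption: each message starts with a "From " separator line.
                if line.startswith(b"From "):
                    report["messages"] += 1
    except (EOFError, OSError, zlib.error):
        # Raised when the gzip container itself is cut short or corrupt.
        report["gzip_complete"] = False
    return report
```

Fetching the same export URL several times (for example with `urllib.request.urlopen(url).read()`) and comparing the reports side by side should show whether only the compressed size varies or the decompressed content does too.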

#4 Updated by hellcp 6 months ago

Wow, sorry for the inaction on this. I had no idea this ticket existed until I looked at the snapshot review site to see why I wasn't getting notified of new releases. I will bring this up with upstream and see what can be done to resolve this.

#5 Updated by hellcp 6 months ago

It seems that for the archive that Jimmy brought up specifically, the 14th of July always fails to load, so the limit on how large a mail can be and still be accepted on the mailing list may be too high.

As to why this currently happens, it's likely caused by the current timeout value in uwsgi. We could increase that, since archives like this are quite large and take a while to process. We also need to optimize hyperkitty in general, because it times out on much smaller pages, but that's an issue that's already reported in another ticket and needs to be addressed there.

Would it be possible to use smaller increments of the archives instead of a full month, while we work on a more permanent solution? Instead of trying to load https://lists.opensuse.org/archives/list/factory@lists.opensuse.org/export/factory@lists.opensuse.org-2021-07.mbox.gz?start=2021-07-01&end=2021-08-01, maybe try https://lists.opensuse.org/archives/list/factory@lists.opensuse.org/export/factory@lists.opensuse.org-2021-07.mbox.gz?start=2021-07-01&end=2021-07-15 and https://lists.opensuse.org/archives/list/factory@lists.opensuse.org/export/factory@lists.opensuse.org-2021-07.mbox.gz?start=2021-07-15&end=2021-08-01 and combine the results.
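
The split-and-combine workaround could be scripted roughly as follows. This is a sketch under assumptions, not a description of how hyperkitty works: the URL layout is copied from the links above, the helper names `split_range`, `export_url`, and `combine` are hypothetical, and the `end` date is assumed to be exclusive (if it turns out to be inclusive, messages on the boundary date would appear twice and need de-duplicating).

```python
import gzip
from datetime import date, timedelta
from urllib.parse import urlencode

# Assumption: base path taken from the export links quoted in this ticket.
BASE = "https://lists.opensuse.org/archives/list"


def split_range(start: date, end: date, days: int) -> list[tuple[date, date]]:
    """Split [start, end) into consecutive windows of at most `days` days."""
    if days < 1:
        raise ValueError("days must be at least 1")
    windows = []
    cursor = start
    while cursor < end:
        upper = min(cursor + timedelta(days=days), end)
        windows.append((cursor, upper))
        cursor = upper
    return windows


def export_url(list_id: str, start: date, end: date) -> str:
    """Build an export URL in the same shape as the links in this ticket."""
    name = f"{list_id}-{start:%Y-%m}.mbox.gz"
    query = urlencode({"start": start.isoformat(), "end": end.isoformat()})
    return f"{BASE}/{list_id}/export/{name}?{query}"


def combine(parts: list[bytes]) -> bytes:
    """Decompress each downloaded part and join them into one mbox."""
    return b"".join(gzip.decompress(part) for part in parts)
```

For example, `split_range(date(2021, 7, 1), date(2021, 8, 1), 15)` yields three windows whose URLs can be fetched one after another and passed to `combine` in order. Smaller windows mean more requests, but each one is more likely to finish before the server-side timeout.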

#6 Updated by pjessen 5 months ago
