tickets #95756

open

factory mailing list archives incomplete

Added by boombatower almost 3 years ago. Updated over 2 years ago.

Status: New
Priority: Normal
Assignee: -
Category: Mailing lists
Target version: -
Start date: 2021-07-21
Due date:
% Done: 0%
Estimated time:

Related issues 1 (0 open, 1 closed)

Related to openSUSE admin - tickets #103911: Mailing list archives broken? (Resolved, 2021-12-13)

Actions #1

Updated by pjessen almost 3 years ago

  • Category set to Mailing lists
  • Private changed from Yes to No
Actions #2

Updated by pjessen almost 3 years ago

  • Subject changed from mailing list archives incomplete to factory mailing list archives incomplete

I can confirm: the factory list mbox archive for July 2021 stops at around 14 July.
Looking at e.g. https://lists.opensuse.org/archives/list/users@lists.opensuse.org/export/users@lists.opensuse.org-2021-08.mbox.gz?start=2021-07-01&end=2021-08-01 or https://lists.opensuse.org/archives/list/heroes@lists.opensuse.org/export/heroes@lists.opensuse.org-2021-08.mbox.gz?start=2021-07-01&end=2021-08-01, they both seem to be complete. Maybe this only affects the factory list? I expect those mbox.gz exports are created on-demand, so that is probably the place to start looking.

Actions #3

Updated by blackbrook over 2 years ago

I would like to emphasize that what is going on is non-deterministic, which should be a big clue as to the cause. If you run that curl command multiple times and look at the size of the resulting .gz file (in bytes), you will see that it is different each time. Some ranges may produce a file that looks complete (and may be usable), but there is still usually a slight variation in the size in bytes, which suggests to me that in the cases that appear successful, varying amounts of trailing whitespace are being harmlessly truncated.
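
The repeated-download check described above could be scripted roughly as follows. This is only a sketch: the helper names (`download_size`, `summarize_sizes`, `probe`), the run count, and the timeout are assumptions for illustration, not anything taken from the list server.

```python
import urllib.request


def download_size(url, timeout=120):
    """Fetch the export once and return the size of the response body in bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return len(resp.read())


def summarize_sizes(sizes):
    """Summarize a list of byte counts; more than one distinct value
    means the export differed between runs."""
    return {
        "runs": len(sizes),
        "distinct": len(set(sizes)),
        "min": min(sizes),
        "max": max(sizes),
    }


def probe(url, runs=5):
    """Download the same export URL several times and summarize the sizes."""
    return summarize_sizes([download_size(url) for _ in range(runs)])
```

Calling `probe()` with one of the export URLs from this ticket and getting `distinct` greater than 1 would reproduce the size variation; a large gap between `min` and `max` would point at real truncation rather than whitespace differences.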

Also, it is not the .gz that is being truncated but the .mbox file that goes into the archive.
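
One way to tell the two cases apart is to check whether the gzip stream itself decompresses cleanly and then count the messages in the mbox it contains. The sketch below assumes the download is already in memory as bytes; `inspect_export` is a hypothetical helper, and counting lines that start with `From ` is only an approximation of the mbox message count (it relies on body lines being escaped as `>From `).

```python
import gzip


def inspect_export(gz_bytes):
    """Report whether the gzip stream is complete and how many mbox
    messages (lines starting with 'From ') the decompressed data holds."""
    try:
        raw = gzip.decompress(gz_bytes)
    except (EOFError, gzip.BadGzipFile):
        # The compressed stream itself is cut off or not gzip at all.
        return {"gzip_complete": False, "messages": None}
    count = sum(1 for line in raw.splitlines() if line.startswith(b"From "))
    return {"gzip_complete": True, "messages": count}
```

If `gzip_complete` is true but the message count changes from one download to the next, that would be consistent with the mbox being cut short before compression.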

Actions #4

Updated by hellcp over 2 years ago

Wow, sorry for the inaction on this. I had no idea this ticket existed until I looked at the snapshot review site to see why I wasn't getting notified of new releases. I will bring this up with upstream and see what can be done to resolve it.

Actions #5

Updated by hellcp over 2 years ago

It seems that for the archive Jimmy brought up specifically, 14 July always fails to load, so our limit on how large a mail can be and still be accepted on the mailing list may be too high.

As to why this currently happens, it's likely caused by the current timeout value in uwsgi. We could increase that, since archives like this are quite large and take a while to process. We also need to optimize hyperkitty in general, because it times out on much smaller pages, but that's an issue that's already reported in another ticket and needs to be addressed there.

Would it be possible to request the archives in smaller increments instead of a full month? Instead of trying to load https://lists.opensuse.org/archives/list/factory@lists.opensuse.org/export/factory@lists.opensuse.org-2021-07.mbox.gz?start=2021-07-01&end=2021-08-01, maybe try https://lists.opensuse.org/archives/list/factory@lists.opensuse.org/export/factory@lists.opensuse.org-2021-07.mbox.gz?start=2021-07-01&end=2021-07-15 and https://lists.opensuse.org/archives/list/factory@lists.opensuse.org/export/factory@lists.opensuse.org-2021-07.mbox.gz?start=2021-07-15&end=2021-08-01 and combine the results, while we work on a more permanent solution.
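
A minimal sketch of that workaround, assuming the export URL pattern shown above: the function names are hypothetical, the file name label is simply derived from each chunk's start date, and whether `end` is inclusive or exclusive on the server is not known from this ticket.

```python
import gzip
import urllib.request
from datetime import date

BASE = "https://lists.opensuse.org/archives/list"


def export_url(list_addr, start, end):
    """Build an export URL for one date range, following the pattern in this ticket."""
    label = f"{list_addr}-{start:%Y-%m}"  # assumption: label taken from the start date
    return (f"{BASE}/{list_addr}/export/{label}.mbox.gz"
            f"?start={start.isoformat()}&end={end.isoformat()}")


def chunk_urls(list_addr, boundaries):
    """Turn consecutive boundary dates into one export URL per chunk."""
    return [export_url(list_addr, a, b) for a, b in zip(boundaries, boundaries[1:])]


def combine_mboxes(parts):
    """Concatenate decompressed mbox chunks, making sure each ends with a newline."""
    out = bytearray()
    for part in parts:
        out += part
        if part and not part.endswith(b"\n"):
            out += b"\n"
    return bytes(out)


def fetch_combined(list_addr, boundaries, timeout=120):
    """Download every chunk, decompress it, and return one combined mbox."""
    parts = []
    for url in chunk_urls(list_addr, boundaries):
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            parts.append(gzip.decompress(resp.read()))
    return combine_mboxes(parts)
```

For the factory archive this would be called as `fetch_combined("factory@lists.opensuse.org", [date(2021, 7, 1), date(2021, 7, 15), date(2021, 8, 1)])`. If the server treats `end` as inclusive, mails from 15 July may appear in both chunks and would need de-duplicating, for example by Message-ID.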

Actions #6

Updated by pjessen over 2 years ago
