tickets #116620

closed

matrix.infra.o.o running out of disk space

Added by crameleon almost 2 years ago. Updated almost 2 years ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
IRC and Matrix
Target version:
-
Start date:
2022-09-15
Due date:
% Done:

0%

Estimated time:
Tags:

Description

Hi,

The root partition on this machine keeps reaching 100% disk usage (sometimes it drops back to ~95%; I assume some cleanup runs on a schedule). Would someone with the necessary access be so kind as to either check whether the services running there could be adjusted to use less disk space (possibly by installing stricter logrotate rules) or increase the root disk size?

Note that whenever the machine reaches 100%, the Matrix homeserver becomes unresponsive.
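
For reference, a minimal sketch of how the fill level described above could be checked from a script. The function names and the 95% threshold are assumptions for illustration; nothing in this ticket says such a check runs on the machine.

```python
import shutil


def usage_percent(path: str = "/") -> float:
    """Return the used share of the filesystem holding `path`, in percent."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some virtual filesystems report a size of zero; treat them as empty.
        return 0.0
    return 100.0 * usage.used / usage.total


def is_nearly_full(path: str = "/", threshold: float = 95.0) -> bool:
    """True when usage is at or above `threshold` percent (hypothetical limit)."""
    return usage_percent(path) >= threshold


if __name__ == "__main__":
    print(f"/ is {usage_percent('/'):.1f}% full, nearly full: {is_nearly_full('/')}")
```

`shutil.disk_usage` reports raw used and total bytes, so the figure can differ slightly from `df`, which leaves blocks reserved for root out of its percentage.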

Best,
Georg


Files

Actions #1

Updated by crameleon almost 2 years ago

  • Tags set to matrix, vm
  • Tracker changed from communication to tickets
  • Priority changed from Normal to High
  • Private changed from Yes to No
Actions #2

Updated by cboltz almost 2 years ago

  • Category set to IRC and Matrix
  • Assignee set to hellcp
Actions #3

Updated by hellcp almost 2 years ago

It was me cleaning it up by hand. The issue is the Discord bridge crashing all the time. It needs a rebase onto the current master (we have 3 or so custom patches on the server), since that bug has been resolved there, but I simply do not have the time to do that.

Actions #4

Updated by cboltz almost 2 years ago

/var/log/messages gets spammed by synapse - today's log file is 800 MB.

As a hotfix, I moved /var/log/messages-*.xz (44 files, 2.7 GB!) to /var/log/matrix-synapse/, which is on a separate partition. This frees up some space on the root partition, but as long as /var/log/messages keeps getting spammed, that free space won't last long.
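
The hotfix above amounts to moving rotated log archives to a directory on another partition. A minimal sketch of that step, assuming a hypothetical helper named `move_rotated_logs`; the glob pattern mirrors the file names mentioned in the comment, everything else is illustrative.

```python
import shutil
from pathlib import Path


def move_rotated_logs(src_dir: str, dest_dir: str,
                      pattern: str = "messages-*.xz") -> list[Path]:
    """Move files matching `pattern` from src_dir to dest_dir.

    Returns the new paths. shutil.move falls back to copy-and-delete when
    source and destination are on different filesystems.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    for archive in sorted(Path(src_dir).glob(pattern)):
        if archive.is_file():
            target = dest / archive.name
            shutil.move(str(archive), str(target))
            moved.append(target)
    return moved


# Example call with the paths from the comment (needs root on the real host):
# move_rotated_logs("/var/log", "/var/log/matrix-synapse")
```

The live `/var/log/messages` file does not match the pattern and stays in place, so this only relieves the partition until the archives pile up again.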

Actions #5

Updated by crameleon almost 2 years ago

hellcp wrote:

discord bridge

Can I be of help with this? I tried to check on the machine but can't read most directories. If you find the time, we could have a short chat about what is where to get me started.

Actions #6

Updated by hellcp almost 2 years ago

You honestly just need to know it's /var/lib/matrix-synapse/discord/. Do a git fetch and a git rebase onto the upstream main branch (it is worth checking git remote to see which remotes exist). Remember to make sure the permissions of the files in the repo are correct. Don't forget to run the npm build command before restarting the Discord bridge service with systemctl restart discord.
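
The steps above can be sketched as a small dry-run script. The remote and branch names are left as parameters because the comment does not name them, `npm run build` is an assumption about what "the npm build command" refers to, and the permission fix is omitted because the owning user is not stated.

```python
import subprocess

REPO_DIR = "/var/lib/matrix-synapse/discord/"  # path given in the comment above


def update_commands(remote: str, branch: str) -> list[list[str]]:
    """Return the command sequence for updating the bridge checkout."""
    return [
        ["git", "remote", "-v"],                  # check which remotes exist
        ["git", "fetch", remote],
        ["git", "rebase", f"{remote}/{branch}"],  # replay local commits on upstream
        ["npm", "run", "build"],                  # assumed build command
        ["systemctl", "restart", "discord"],      # service name from the comment
    ]


def run_all(commands: list[list[str]], cwd: str = REPO_DIR,
            dry_run: bool = True) -> list[str]:
    """Print each command; execute it only when dry_run is False."""
    shown = []
    for cmd in commands:
        line = " ".join(cmd)
        print("+", line)
        shown.append(line)
        if not dry_run:
            subprocess.run(cmd, cwd=cwd, check=True)
    return shown


# "upstream" and "develop" are placeholders; check `git remote -v` first.
# run_all(update_commands("upstream", "develop"))
```

Keeping the default at `dry_run=True` means the script only prints what it would do until someone has confirmed the remote and branch.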

If you want to get hold of me to talk about it over voice chat, that won't be possible until October or later.

Actions #7

Updated by crameleon almost 2 years ago

  • Assignee changed from hellcp to crameleon

Thanks. Bernhard helped me get the right access. The directory had some untracked changes; I just cloned and built the stable branch. Works fine. Backup tarballs are in /data/backup. Synapse still logs a lot of 404 errors; I will switch logging to a file with rotation on its separate partition after the restart on Thursday.
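
Synapse reads its logging setup from a YAML file that follows the schema of Python's `logging.config.dictConfig`, so the planned change can be illustrated in plain Python. This is a sketch under assumptions: the log file name, the daily rotation, and the retention count are hypothetical, and the actual configuration on the server is not shown in this ticket.

```python
import logging
import logging.config


def rotating_file_config(log_path: str, backup_count: int = 7) -> dict:
    """Build a dictConfig-style dict that logs to a rotating file only."""
    return {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {
            "precise": {
                "format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s",
            },
        },
        "handlers": {
            "file": {
                "class": "logging.handlers.TimedRotatingFileHandler",
                "formatter": "precise",
                "filename": log_path,
                "when": "midnight",           # start a new file once per day
                "backupCount": backup_count,  # older files are deleted
                "encoding": "utf8",
            },
        },
        # No console handler, so nothing ends up in the system journal or syslog.
        "root": {"level": "INFO", "handlers": ["file"]},
    }


# Hypothetical file name on the separate log partition:
# logging.config.dictConfig(rotating_file_config("/var/log/matrix-synapse/homeserver.log"))
```

With `backupCount` set, the handler removes the oldest rotated files itself, which bounds the number of log files kept on that partition.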

Actions #8

Updated by hellcp almost 2 years ago

I should probably look into that, but I don't think the stable branch includes reply handling or the fix for the username regex, which caused some errors in the past.

Actions #9

Updated by crameleon almost 2 years ago

The "develop" branch (which seems to be upstream's "master" equivalent) is only three commits ahead of the v3.0.0 release: https://github.com/matrix-org/matrix-appservice-discord/commits/develop. We were running off a local branch which diverged off an upstream commit in July, and the local changes were mostly related to package.json and yarn.lock. Let me know if I missed anything that's relevant to us and I'll change it.

Actions #10

Updated by crameleon almost 2 years ago

  • Status changed from New to Resolved

Since Synapse crashed earlier today, I took the opportunity to change the log handler right away instead of waiting for Thursday. The logs should no longer clutter the root partition. Disk usage on /data is still growing; we should install a cron job to delete old/large attachment uploads at some point, but that is out of scope for this ticket.
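
A minimal sketch of what such a cleanup job could start from: a dry-run listing of files that are old or large. The thresholds and the function name are assumptions, and the script deliberately deletes nothing. Synapse also offers admin API endpoints for purging media, which may be the safer route because they keep the database in step with the files on disk.

```python
import time
from pathlib import Path


def find_cleanup_candidates(root: str, min_age_days: float = 90,
                            min_size_bytes: int = 50 * 1024 * 1024,
                            now=None) -> list[Path]:
    """List files under `root` that are older than min_age_days
    or at least min_size_bytes in size. Nothing is deleted."""
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400
    hits = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        stat = path.stat()
        if stat.st_mtime < cutoff or stat.st_size >= min_size_bytes:
            hits.append(path)
    return sorted(hits)


# Example: review the candidates before deciding what to remove.
# for p in find_cleanup_candidates("/data"):
#     print(p)
```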

Actions #11

Updated by hellcp almost 2 years ago

  • File Screenshot from 2022-09-24 20-32-56.png added
  • File Screenshot from 2022-09-24 20-41-55.png added

crameleon wrote:

The "develop" branch (which seems to be upstream's "master" equivalent) is only three commits ahead of the v3.0.0 release: https://github.com/matrix-org/matrix-appservice-discord/commits/develop. We were running off a local branch which diverged off an upstream commit in July, and the local changes were mostly related to package.json and yarn.lock. Let me know if I missed anything that's relevant to us and I'll change it.

Yup, what is missing are the cherry-picked commits we had from various other forks. You only checked git status, whereas all of that is in git log, because these are commits, not untracked changes. This leads to situations like this, where people have no idea what people on Discord are responding to, because reply support isn't in this branch.
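
The distinction drawn above can be sketched as follows: `git status` only reports uncommitted or untracked files, while commits cherry-picked onto a local branch appear in `git log <upstream>..HEAD`. The helper names and the upstream ref passed in are assumptions for illustration.

```python
import subprocess


def git_lines(repo: str, *args: str) -> list[str]:
    """Run a git command inside `repo` and return its non-empty output lines."""
    result = subprocess.run(
        ["git", "-C", repo, *args],
        check=True, capture_output=True, text=True,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]


def local_only_commits(repo: str, upstream_ref: str) -> list[str]:
    """Commits reachable from HEAD but not from upstream_ref."""
    return git_lines(repo, "log", "--oneline", f"{upstream_ref}..HEAD")


def uncommitted_changes(repo: str) -> list[str]:
    """Modified or untracked files, as reported by git status."""
    return git_lines(repo, "status", "--porcelain")


# Placeholder ref; use whatever `git remote -v` and `git branch -r` show.
# print(local_only_commits("/var/lib/matrix-synapse/discord/", "origin/develop"))
# print(uncommitted_changes("/var/lib/matrix-synapse/discord/"))
```

Checking both lists before replacing a checkout shows which local patches would be lost.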

Actions #12

Updated by hellcp almost 2 years ago

  • File deleted (Screenshot from 2022-09-24 20-41-55.png)
Actions #13

Updated by hellcp almost 2 years ago

  • File deleted (Screenshot from 2022-09-24 20-32-56.png)
Actions #15

Updated by crameleon almost 2 years ago

Of course I checked git log - that's where I found some helpful commit comments like "fixed bridge". That is in addition to the untracked changes.

I could not find the branch name "derberg" on any of the remote repositories either.

I reverted to the old version. If we use custom patches, it would be good to maintain a proper fork on a remote repository.

Actions #16

Updated by hellcp almost 2 years ago

Eh, I guess somebody else must have reset the branch before as well. I will go and track down the commits we needed from it.

Actions #17

Updated by crameleon almost 2 years ago

As discussed in chat, just to track here: running at v3.0.0 with GitHub/matrix-appservice-discord PR patches 704 and 783 now.
