tickets #96503
closedfedaration of matrix chat.opensuse.org is lagging
20%
Description
Incoming messages e.g. sent from a matrix.org to a room on opensuse.org have a long delay until they can be read with an opensuse.org account.
Updated by jzerebecki over 3 years ago
Same problem is mentioned in https://progress.opensuse.org/issues/93366 .
Updated by jzerebecki over 3 years ago
There are some Prometheus metrics for federation which could give a lead: https://github.com/matrix-org/synapse/blob/684d19a11c3b93c9dd5fb90f43d38aa7e8c6005f/synapse/storage/databases/main/event_federation.py#L37
Updated by jzerebecki over 3 years ago
I tested with darix from another matrix server, that when a message from another server than matrix.org comes in then the messages from matrix.org that were delayed show immediately on chat.opensuse.org. We were using the room #test-federation:opensuse.org to test that.
Updated by cboltz over 3 years ago
For the records: I updated matrix to the latest packages about an hour ago:
# zypper in element-web matrix-synapse
...
The following 7 packages are going to be upgraded:
element-web 1.7.32-lp153.2.1 -> 1.8.1-lp153.2.1 noarch openSUSE:infrastructure:matrix obs://build.opensuse.org/openSUSE
matrix-synapse 1.36.0-lp153.185.2 -> 1.40.0-lp153.191.2 noarch openSUSE:infrastructure:matrix obs://build.opensuse.org/openSUSE
python3-Jinja2 2.11.3-lp153.2.7 -> 3.0.1-lp153.2.3 x86_64 devel:languages:python:backports obs://build.opensuse.org/devel:languages:python
python3-Pillow 8.2.0-lp153.2.1 -> 8.3.1-lp153.3.2 x86_64 devel:languages:python:backports obs://build.opensuse.org/devel:languages:python
python3-Twisted 21.2.0-lp153.3.1 -> 21.7.0-lp153.3.1 x86_64 devel:languages:python:backports obs://build.opensuse.org/devel:languages:python
python3-pysaml2 6.5.1-lp153.4.1 -> 7.0.1-lp153.2.3 x86_64 devel:languages:python:backports obs://build.opensuse.org/devel:languages:python
python3-typing_extensions 3.7.4.3-lp153.5.13 -> 3.10.0.0-lp153.2.2 noarch devel:languages:python:backports obs://build.opensuse.org/devel:languages:python
Updated by jzerebecki over 3 years ago
The update didn't help with this specific problem, but was still a good idea.
I sent a pull request to enable the metrics which might give a hint for this problem: https://github.com/openSUSE/heroes-salt/compare/production...JanZerebecki:matrix-enable-metrics
Updated by jzerebecki over 3 years ago
Bernhard took a look and found a few errors messages.
The most related one is:
Aug 27 11:43:51 matrix synapse[27658]: synapse.handlers.federation: [_process_incoming_pdus_in_room_inner-13-$REDACTED1] Error attempting to resolve state at missing prev_events
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/synapse/handlers/federation.py", line 387, in on_receive_pdu
origin, room_id, p
File "/usr/lib/python3.6/site-packages/synapse/handlers/federation.py", line 732, in _get_state_after_missing_prev_event
destination=destination, room_id=room_id, events=missing_events
File "/usr/lib/python3.6/site-packages/synapse/handlers/federation.py", line 1379, in _get_events_and_persist
event_infos,
File "/usr/lib/python3.6/site-packages/synapse/handlers/federation.py", line 2323, in _auth_and_persist_events
backfilled=backfilled,
File "/usr/lib/python3.6/site-packages/synapse/handlers/federation.py", line 3320, in persist_events_and_notify
event_and_contexts, backfilled=backfilled
File "/usr/lib/python3.6/site-packages/synapse/storage/persist_events.py", line 322, in persist_events
ret_vals = await yieldable_gather_results(enqueue, partitioned.items())
File "/usr/lib/python3.6/site-packages/synapse/storage/persist_events.py", line 240, in handle_queue_loop
item.events_and_contexts, item.backfilled
File "/usr/lib/python3.6/site-packages/synapse/storage/persist_events.py", line 582, in _persist_event_batch
backfilled=backfilled,
File "/usr/lib/python3.6/site-packages/synapse/storage/databases/main/events.py", line 180, in _persist_events_and_state_updates
new_forward_extremeties=new_forward_extremeties,
File "/usr/lib/python3.6/site-packages/synapse/storage/database.py", line 694, in runInteraction
**kwargs,
File "/usr/lib/python3.6/site-packages/synapse/storage/database.py", line 792, in runWithConnection
self._db_pool.runWithConnection(inner_func, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/twisted/python/threadpool.py", line 238, in inContext
result = inContext.theWork() # type: ignore[attr-defined]
File "/usr/lib/python3.6/site-packages/twisted/python/threadpool.py", line 255, in <lambda>
ctx, func, *args, **kw
File "/usr/lib/python3.6/site-packages/twisted/python/context.py", line 118, in callWithContext
return self.currentContext().callWithContext(ctx, func, *args, **kw)
File "/usr/lib/python3.6/site-packages/twisted/python/context.py", line 83, in callWithContext
return func(*args, **kw)
File "/usr/lib/python3.6/site-packages/twisted/enterprise/adbapi.py", line 293, in _runWithConnection
compat.reraise(excValue, excTraceback)
File "/usr/lib/python3.6/site-packages/twisted/python/deprecate.py", line 298, in deprecatedFunction
return function(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/twisted/python/compat.py", line 404, in reraise
raise exception.with_traceback(traceback)
File "/usr/lib/python3.6/site-packages/twisted/enterprise/adbapi.py", line 284, in _runWithConnection
result = func(conn, *args, **kw)
File "/usr/lib/python3.6/site-packages/synapse/storage/database.py", line 786, in inner_func
return func(db_conn, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/synapse/storage/database.py", line 554, in new_transaction
r = func(cursor, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/synapse/logging/utils.py", line 69, in wrapped
return f(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/synapse/storage/databases/main/events.py", line 378, in _persist_events_txn
self._store_event_txn(txn, events_and_contexts=events_and_contexts)
File "/usr/lib/python3.6/site-packages/synapse/storage/databases/main/events.py", line 1350, in _store_event_txn
for event, _ in events_and_contexts
File "/usr/lib/python3.6/site-packages/synapse/storage/database.py", line 929, in simple_insert_many_txn
txn.execute_batch(sql, vals)
File "/usr/lib/python3.6/site-packages/synapse/storage/database.py", line 279, in execute_batch
self._do_execute(lambda *x: execute_batch(self.txn, *x), sql, args)
File "/usr/lib/python3.6/site-packages/synapse/storage/database.py", line 331, in _do_execute
return func(sql, *args)
File "/usr/lib/python3.6/site-packages/synapse/storage/database.py", line 279, in <lambda>
self._do_execute(lambda *x: execute_batch(self.txn, *x), sql, args)
File "/usr/lib64/python3.6/site-packages/psycopg2/extras.py", line 1209, in execute_batch
cur.execute(b";".join(sqls))
psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "event_json_event_id_key"
DETAIL: Key (event_id)=($Redacted2) already exists.
The next error looks like https://github.com/matrix-org/synapse/issues/8094 which reads to me as if it is not related.
Aug 27 11:30:04 matrix synapse[27658]: synapse.http.server: [PUT-250] Failed handle request via 'RoomSendEventRestServlet': <XForwardedForRequest at 0x7f26fd0fe6d8 method='PUT' uri='/_matrix/client/r0/rooms/!REDACTED3/send/m.room.message/REDACTED4?user_id=REDACTED5' clientproto='HTTP/1.1' site='8008'>
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/twisted/internet/defer.py", line 1661, in _inlineCallbacks
result = current_context.run(gen.send, result)
File "/usr/lib/python3.6/site-packages/twisted/internet/defer.py", line 63, in run
return f(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/synapse/rest/client/v1/room.py", line 252, in on_POST
requester, event_dict, txn_id=txn_id
File "/usr/lib/python3.6/site-packages/synapse/handlers/message.py", line 854, in create_and_send_nonmember_event
depth=depth,
File "/usr/lib/python3.6/site-packages/synapse/handlers/message.py", line 611, in create_event
depth=depth,
File "/usr/lib/python3.6/site-packages/synapse/util/metrics.py", line 91, in measured_func
r = await func(self, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/synapse/handlers/message.py", line 928, in create_new_client_event
), "Attempting to create an event with no prev_events"
AssertionError: Attempting to create an event with no prev_events
The apparmor profile probably restricts something that synapse needs, but I don't know if that it related to the lag problem:
File "/usr/lib64/python3.6/tempfile.py", line 298, in gettempdir
tempdir = _get_default_tempdir()
File "/usr/lib64/python3.6/tempfile.py", line 233, in _get_default_tempdir
dirlist)
builtins.FileNotFoundError: [Errno 2] No usable temporary directory found in ['/tmp', '/var/tmp', '/usr/tmp', '/data/var/lib/matrix-synapse']
Updated by bmwiedemann over 3 years ago
- % Done changed from 0 to 20
- Private changed from Yes to No
Updated by jzerebecki over 3 years ago
This should fix the duplicate key error: https://github.com/matrix-org/synapse/pull/10719
Updated by jzerebecki over 3 years ago
Updated by jzerebecki over 3 years ago
After synapse is running with the previous change, it gets further but there is another constraint violation:
Sep 15 15:16:14 matrix synapse[1992]: synapse.handlers.federation_event: [_process_incoming_pdus_in_room_inner-REDACTED1] Error attempting to resolve state at missing prev_events
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/synapse/handlers/federation_event.py", line 712, in _resolve_state_at_missing_prevs
dest, room_id, p
File "/usr/lib/python3.6/site-packages/synapse/handlers/federation_event.py", line 819, in _get_state_after_missing_prev_event
destination=destination, room_id=room_id, events=missing_events
File "/usr/lib/python3.6/site-packages/synapse/handlers/federation_event.py", line 1155, in _get_events_and_persist
event_infos,
File "/usr/lib/python3.6/site-packages/synapse/handlers/federation_event.py", line 1198, in _auth_and_persist_events
for ev_info, context in zip(event_infos, contexts)
File "/usr/lib/python3.6/site-packages/synapse/handlers/federation_event.py", line 1743, in persist_events_and_notify
event_and_contexts, backfilled=backfilled
File "/usr/lib/python3.6/site-packages/synapse/storage/persist_events.py", line 322, in persist_events
ret_vals = await yieldable_gather_results(enqueue, partitioned.items())
File "/usr/lib/python3.6/site-packages/synapse/storage/persist_events.py", line 240, in handle_queue_loop
item.events_and_contexts, item.backfilled
File "/usr/lib/python3.6/site-packages/synapse/storage/persist_events.py", line 582, in _persist_event_batch
backfilled=backfilled,
File "/usr/lib/python3.6/site-packages/synapse/storage/databases/main/events.py", line 180, in _persist_events_and_state_updates
new_forward_extremeties=new_forward_extremeties,
File "/usr/lib/python3.6/site-packages/synapse/storage/database.py", line 694, in runInteraction
**kwargs,
File "/usr/lib/python3.6/site-packages/synapse/storage/database.py", line 792, in runWithConnection
self._db_pool.runWithConnection(inner_func, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/twisted/python/threadpool.py", line 238, in inContext
result = inContext.theWork() # type: ignore[attr-defined]
File "/usr/lib/python3.6/site-packages/twisted/python/threadpool.py", line 255, in <lambda>
ctx, func, *args, **kw
File "/usr/lib/python3.6/site-packages/twisted/python/context.py", line 118, in callWithContext
return self.currentContext().callWithContext(ctx, func, *args, **kw)
File "/usr/lib/python3.6/site-packages/twisted/python/context.py", line 83, in callWithContext
return func(*args, **kw)
File "/usr/lib/python3.6/site-packages/twisted/enterprise/adbapi.py", line 293, in _runWithConnection
compat.reraise(excValue, excTraceback)
File "/usr/lib/python3.6/site-packages/twisted/python/deprecate.py", line 298, in deprecatedFunction
return function(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/twisted/python/compat.py", line 404, in reraise
raise exception.with_traceback(traceback)
File "/usr/lib/python3.6/site-packages/twisted/enterprise/adbapi.py", line 284, in _runWithConnection
result = func(conn, *args, **kw)
File "/usr/lib/python3.6/site-packages/synapse/storage/database.py", line 786, in inner_func
return func(db_conn, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/synapse/storage/database.py", line 554, in new_transaction
r = func(cursor, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/synapse/logging/utils.py", line 69, in wrapped
return f(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/synapse/storage/databases/main/events.py", line 378, in _persist_events_txn
self._store_event_txn(txn, events_and_contexts=events_and_contexts)
File "/usr/lib/python3.6/site-packages/synapse/storage/databases/main/events.py", line 1415, in _store_event_txn
txn, table="state_events", values=state_values
File "/usr/lib/python3.6/site-packages/synapse/storage/database.py", line 929, in simple_insert_many_txn
txn.execute_batch(sql, vals)
File "/usr/lib/python3.6/site-packages/synapse/storage/database.py", line 279, in execute_batch
self._do_execute(lambda *x: execute_batch(self.txn, *x), sql, args)
File "/usr/lib/python3.6/site-packages/synapse/storage/database.py", line 331, in _do_execute
return func(sql, *args)
File "/usr/lib/python3.6/site-packages/synapse/storage/database.py", line 279, in <lambda>
self._do_execute(lambda *x: execute_batch(self.txn, *x), sql, args)
File "/usr/lib64/python3.6/site-packages/psycopg2/extras.py", line 1209, in execute_batch
cur.execute(b";".join(sqls))
psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "state_events_event_id_key"
DETAIL: Key (event_id)=(REDACTED2) already exists.
Updated by crameleon 5 months ago
Hi,
closing, as this seems to have been resolved in the meanwhile.
We also have visualization for various Synapse metrics at https://monitor.opensuse.org/grafana/d/000000012/matrix-synapse.
Best,
Georg