Ticket #93621 (closed)
mirrordb - how can table 'filearr' have duplicate rows?
Added by pjessen almost 3 years ago.
Updated almost 3 years ago.
Category:
Core services and virtual infrastructure
Description
I was working on #92930 which kept complaining about duplicate rows in 'filearr'.
Given that filearr has a unique index on 'path', it seems this ought to be impossible.
I deleted more than 5000 duplicate rows, but they keep coming back. It is screwing up the mirror scanning, see e.g. #93477 and #93255.
Yesterday whilst working on #93477, I deleted 1686 duplicates (from 'filearr' and 'hash').
Obviously something was changed somewhere, but what?
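For reference, a cleanup of this kind can be sketched as a self-join delete that keeps the lowest id per path (a sketch only, not the exact statement used here; the matching rows in 'hash' would need the same treatment):

```sql
-- Sketch: remove all but the smallest-id row for each duplicated path.
-- Assumes the lowest id is the row to keep.
BEGIN;
DELETE FROM filearr f
USING filearr keep
WHERE f.path = keep.path
  AND f.id > keep.id;   -- any row with a lower-id twin on the same path goes
COMMIT;
```

Note the join deliberately compares 'path' row-by-row rather than relying on the (possibly corrupt) unique index.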
- Private changed from Yes to No
I was just looking at the logs on mirrordb1, '/var/lib/pgsql/data13/log/postgresql*zst', and I see the first 'duplicate key' error appearing on 2021-05-20 at 21:19:13.
pjessen wrote:
I was just looking at the logs on mirrordb1, '/var/lib/pgsql/data13/log/postgresql*zst', and I see the first 'duplicate key' error appearing on 2021-05-20 at 21:19:13.
Judging by the zypper logs, mirrordb1 was dup'ed to Leap 15.3 on 2021-05-20 at 21:08:49. I think that answers my question above 'Obviously something was changed somewhere, but what?'.
- Category set to Core services and virtual infrastructure
After we switched to running on mirrordb2, I took a look at the logs. Since this morning at 01:46 UTC, I see 46 cases of:
ERROR: duplicate key value violates unique constraint "filearr_path_key"
and three cases of:
ERROR: duplicate key value violates unique constraint "device_uniqueness"
I have tried to run this a couple of times, but it keeps timing out:
select path,array_agg(id) from filearr group by path having count(*)>1;
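Assuming the timeout comes from a configured statement_timeout (an assumption, I have not checked the server settings), one workaround is to lift it for the session before running the query:

```sql
-- Disable the per-statement timeout for this session only,
-- then run the duplicate search unchanged.
SET statement_timeout = 0;
SELECT path, array_agg(id) FROM filearr GROUP BY path HAVING count(*) > 1;
```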
2021-06-15 05:36:38.358 UTC [6796]: [4-1] db=mb_opensuse2,user=mb ERROR: duplicate key value violates unique constraint "filearr_path_key"
2021-06-15 05:36:38.358 UTC [6796]: [5-1] db=mb_opensuse2,user=mb DETAIL: Key (path)=(tumbleweed/repo/oss/x86_64/libmlt++-devel-6.26.1-1.1.x86_64.rpm) already exists.
2021-06-15 05:36:38.358 UTC [6796]: [6-1] db=mb_opensuse2,user=mb CONTEXT: SQL statement "UPDATE filearr
SET mirrors = arr WHERE id = arg_fileid"
PL/pgSQL function mirr_del_byid(integer,integer) line 21 at SQL statement
2021-06-15 05:36:38.358 UTC [6796]: [7-1] db=mb_opensuse2,user=mb STATEMENT: SELECT COUNT(mirr_del_byid(556, id) order by id) FROM temp1
I am sort of taking notes as I go - I am not very familiar with PostgreSQL, so I might well forget something.
On both db1 and db2, table 'filearr' has some indexes named 'something_ccnew':
mb_opensuse2=# \d filearr
Table "public.filearr"
Column | Type | Collation | Nullable | Default
----------+------------------------+-----------+----------+-------------------------------------
id | integer | | not null | nextval('filearr_id_seq'::regclass)
path | character varying(512) | | not null |
mirrors | smallint[] | | |
dirname | character varying(512) | | |
filename | character varying(512) | | |
Indexes:
"filearr_pkey" PRIMARY KEY, btree (id)
"filearr_path_key" UNIQUE CONSTRAINT, btree (path)
"filearr_path_key_ccnew" UNIQUE, btree (path) INVALID
"filearr_pkey_ccnew" UNIQUE, btree (id) INVALID
"filearr_dirname_btree" btree (dirname)
"filearr_dirname_btree_ccnew" btree (dirname)
According to the manual, https://www.postgresql.org/docs/13/sql-reindex.html :
If the index marked INVALID is suffixed ccnew, then it corresponds to the transient index created during the concurrent operation, and the recommended recovery method is to drop it using DROP INDEX, then attempt REINDEX CONCURRENTLY again.
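A plausible mechanism (my assumption, not confirmed in this ticket): the dup to Leap 15.3 pulled in a newer glibc, and glibc collation changes are a known way for btree indexes on text columns to become silently inconsistent, which would let rows slip past the unique index on 'path'. Following the manual's recovery advice, the cleanup could look like:

```sql
-- Drop the INVALID transient indexes left behind by the failed
-- concurrent operation, plus the redundant dirname copy, then
-- rebuild everything on the table without blocking writes.
DROP INDEX filearr_path_key_ccnew;
DROP INDEX filearr_pkey_ccnew;
DROP INDEX filearr_dirname_btree_ccnew;  -- duplicate of filearr_dirname_btree
REINDEX TABLE CONCURRENTLY filearr;
```

REINDEX ... CONCURRENTLY is available from PostgreSQL 12 onwards.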
Brief update - so far no new duplicates!
pjessen wrote:
Brief update - so far no new duplicates!
Approx. 24 hours later, still no new dupes!
- Assignee set to pjessen
- % Done changed from 0 to 100
Still no new dupes. I'm closing this; the downgrade to pg12 has clearly solved the issue.
- Status changed from New to Resolved