tickets #90209

closed

mirrorbrain makehashes - column filearr.dirname does not exist

Added by pjessen about 3 years ago. Updated over 2 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Mirrors
Target version:
-
Start date:
2021-03-17
Due date:
2021-05-28
% Done:

100%

Estimated time:

Description

Since 14 October 2020, the daily mirrorbrain makehashes job has been failing with:

Traceback (most recent call last):
  File "/usr/bin/mb", line 1729, in <module>
    r = mirrordoctor.main()
  File "/usr/lib/python2.7/site-packages/cmdln.py", line 261, in main
    return self.cmd(args)
  File "/usr/lib/python2.7/site-packages/cmdln.py", line 284, in cmd
    retval = self.onecmd(argv)
  File "/usr/lib/python2.7/site-packages/cmdln.py", line 422, in onecmd
    return self._dispatch_cmd(handler, argv)
  File "/usr/lib/python2.7/site-packages/cmdln.py", line 1123, in _dispatch_cmd
    return handler(argv[0], opts, *args)
  File "/usr/bin/mb", line 1056, in do_makehashes
    for i, j in mb.files.dir_filelist(self.conn, dst_dir_db)]
  File "/usr/lib64/python2.7/site-packages/mb/files.py", line 162, in dir_filelist
    result = conn.Server._connection.queryAll(query)
  File "/usr/lib/python2.7/site-packages/sqlobject/dbconnection.py", line 449, in queryAll
    return self._runWithConnection(self._queryAll, s)
  File "/usr/lib/python2.7/site-packages/sqlobject/dbconnection.py", line 342, in _runWithConnection
    val = meth(conn, *args)
  File "/usr/lib/python2.7/site-packages/sqlobject/dbconnection.py", line 441, in _queryAll
    self._executeRetry(conn, c, s)
  File "/usr/lib/python2.7/site-packages/sqlobject/postgres/pgconnection.py", line 257, in _executeRetry
    raise dberrors.ProgrammingError(msg)
sqlobject.dberrors.ProgrammingError: column filearr.dirname does not exist
LINE 5:                WHERE filearr.dirname = '\\162\\145\\160\\157...
Actions #1

Updated by pjessen about 3 years ago

  • Private changed from Yes to No
Actions #2

Updated by pjessen about 3 years ago

mb_opensuse2=# \d filearr
                                    Table "public.filearr"
 Column  |          Type          | Collation | Nullable |               Default               
---------+------------------------+-----------+----------+-------------------------------------
 id      | integer                |           | not null | nextval('filearr_id_seq'::regclass)
 path    | character varying(512) |           | not null | 
 mirrors | smallint[]             |           |          | 

So the table format was changed, but the code was not updated to match? (or the other way around)

Actions #3

Updated by pjessen about 3 years ago

/usr/lib64/python2.7/site-packages/mb/files.py, line 162:

def dir_filelist(conn, path):
    """Returns tuples of (id, name) for all files that reside in a directory

    The returned filenames include their path."""

    query = """SELECT filearr.path, hash.file_id
                   FROM filearr 
               LEFT JOIN hash 
                   ON hash.file_id = filearr.id 
               WHERE filearr.dirname = '%s/'""" % util.pgsql_regexp_esc(path)

Currently installed mirrorbrain is 'python-mb-2.19.3-lp152.12.1.x86_64'.
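
The WHERE clause above compares filearr.dirname with the directory path plus a trailing slash, so the code expects every row to carry its directory separately from the full path. A minimal sketch of such a split, assuming dirname keeps the trailing slash and filename is the last path component. split_path is a hypothetical helper and the example path is made up; neither comes from mirrorbrain.

```python
def split_path(path):
    """Split a file path into (dirname, filename).

    Assumption: dirname keeps its trailing slash, matching the
    "dirname = '<dir>/'" comparison in dir_filelist().
    """
    head, sep, tail = path.rpartition('/')
    # For a path without any slash, rpartition returns ('', '', path),
    # so dirname ends up as the empty string.
    return head + sep, tail


# Hypothetical example path, for illustration only.
print(split_path('tumbleweed/repo/oss/x86_64/foo.rpm'))
```

With a split like this stored per row, a directory listing becomes an equality test on dirname rather than a pattern match on path.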

Actions #4

Updated by pjessen about 3 years ago

  • Status changed from New to Feedback
  • Assignee set to pjessen

For the time being, I have just edited files.py, line 160 - changed "filearr.dirname" to "filearr.path".

Andrii, I see you have been doing some work on mirrorbrain, I was just wondering if this is a left-over from some of that?

Actions #5

Updated by andriinikitin about 3 years ago

pjessen wrote:

Andrii, I see you have been doing some work on mirrorbrain, I was just wondering if this is a left-over from some of that?

It looks like this patch wasn't executed, which should be part of 2.19.3
https://github.com/openSUSE/mirrorbrain/blob/2.19.3/sql/migrations/schema-postgresql-add-filearr_split_path_trigger.sql

I didn't do anything in production, except some troubleshooting in May iirc

Actions #6

Updated by pjessen about 3 years ago

  • Status changed from Feedback to Workable

andriinikitin wrote:

pjessen wrote:

Andrii, I see you have been doing some work on mirrorbrain, I was just wondering if this is a left-over from some of that?

It looks like this patch wasn't executed, which should be part of 2.19.3
https://github.com/openSUSE/mirrorbrain/blob/2.19.3/sql/migrations/schema-postgresql-add-filearr_split_path_trigger.sql

I didn't do anything in production, except some troubleshooting in May iirc

Okay, thanks for your feedback.

Actions #7

Updated by pjessen about 3 years ago

  • Status changed from Workable to Feedback

andriinikitin wrote:

pjessen wrote:

Andrii, I see you have been doing some work on mirrorbrain, I was just wondering if this is a left-over from some of that?

It looks like this patch wasn't executed, which should be part of 2.19.3
https://github.com/openSUSE/mirrorbrain/blob/2.19.3/sql/migrations/schema-postgresql-add-filearr_split_path_trigger.sql

Before I go and apply it, any reason it may have been deliberately left out? Darix, Andrii?

Actions #8

Updated by andriinikitin about 3 years ago

pjessen wrote:

Before I go and apply it, any reason it may have been deliberately left out? Darix, Andrii?

It should be safe to apply. I don't think it is part of the installer script, so it probably has to be applied manually and simply wasn't, during an update or something similar.

Actions #9

Updated by pjessen about 3 years ago

I ran the updates, but the final "UPDATE filearr SET id=id where dirname is null", which is meant to trigger the update of all the filenames/dirnames, got killed:

ERROR:  deadlock detected
DETAIL:  Process 23433 waits for ShareLock on transaction 2358030461; blocked by process 31890.
Process 31890 waits for ShareLock on transaction 2357996271; blocked by process 23433.
HINT:  See server log for query details.
CONTEXT:  while locking tuple (8233,68) in relation "filearr"
Actions #10

Updated by pjessen about 3 years ago

  • Status changed from Feedback to In Progress

Okay, another few deadlocks and restarts, but it looks like I'm getting there. Only 4512606 rows to go.
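
The restart-after-deadlock routine described here could be automated with a small retry wrapper. This is only a sketch: DeadlockError stands in for whatever exception the database driver raises for a deadlock (PostgreSQL reports it as SQLSTATE 40P01), and run_with_retry, attempts and delay are hypothetical names, not part of mirrorbrain.

```python
import time


class DeadlockError(Exception):
    """Stand-in for the driver's deadlock exception (hypothetical)."""


def run_with_retry(step, attempts=5, delay=1.0, sleep=time.sleep):
    """Call step() until it succeeds, retrying after a deadlock.

    step is any callable that runs one unit of work; a real script
    would also roll back the failed transaction before retrying.
    """
    for attempt in range(1, attempts + 1):
        try:
            return step()
        except DeadlockError:
            if attempt == attempts:
                # Out of attempts: let the caller see the error.
                raise
            # Wait a little longer after each failed attempt.
            sleep(delay * attempt)
```

The sleep function is passed in as a parameter so the wrapper can be exercised without actually waiting.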

Actions #11

Updated by pjessen about 3 years ago

pjessen wrote:

Okay, another few deadlocks and restarts, but it looks like I'm getting there. Only 4512606 rows to go.

2021-03-18 19:00 Uh, correction - 4512606 is the number of rows done ..... 112209127 to go.
2021-03-19 11:00 9450577 done.
2021-03-21 09:47 18606065 done (not running continually since last count).

I'll have to come up with a script for this, but why is it so slow?
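
To put a number on "slow", the first two counts above give a rough throughput. The sketch below assumes the update ran without interruption between those two timestamps, which the notes do not state, so the result is an estimate at best.

```python
from datetime import datetime

# Figures taken from the progress notes above.
t1, done1 = datetime(2021, 3, 18, 19, 0), 4512606
t2, done2 = datetime(2021, 3, 19, 11, 0), 9450577
remaining_at_t1 = 112209127

total = done1 + remaining_at_t1
hours = (t2 - t1).total_seconds() / 3600.0
rate = (done2 - done1) / hours             # rows per hour
days_left = (total - done2) / rate / 24.0  # assumes the rate stays constant

print("rows/hour: %.0f" % rate)
print("days left at that rate: %.1f" % days_left)
```

Under that assumption the remaining rows would need roughly two weeks of uninterrupted running.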

Actions #12

Updated by pjessen about 3 years ago

Still working on this. Small steps at a time.
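
One way to take "small steps" is to split the no-op UPDATE into id ranges and run each range as its own short transaction, which should shorten how long row locks are held. A minimal sketch that only generates the statements; id_batches, batch_statement and the batch size are hypothetical, and the id bounds would have to come from the real table.

```python
def id_batches(min_id, max_id, size):
    """Yield inclusive (low, high) id ranges covering min_id..max_id."""
    low = min_id
    while low <= max_id:
        high = min(low + size - 1, max_id)
        yield low, high
        low = high + 1


def batch_statement(low, high):
    """Build the no-op UPDATE for one id range.

    The bounds are integers produced by id_batches(), so formatting
    them into the statement with %d is safe here.
    """
    return ("UPDATE filearr SET id=id "
            "WHERE dirname IS NULL AND id BETWEEN %d AND %d" % (low, high))


# Tiny illustrative range; a real run would use the table's min/max id.
for low, high in id_batches(1, 25, 10):
    print(batch_statement(low, high))
```

Because each statement keeps the "dirname IS NULL" filter, an interrupted run can be restarted from the beginning without redoing rows that are already filled in.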

Actions #13

Updated by pjessen about 3 years ago

Okay, I have manually updated everything that isn't tumbleweed and isn't repositories. Now 27'388'500 of 117'642'160 rows updated, approx 23%.

Actions #14

Updated by pjessen about 3 years ago

  • % Done changed from 0 to 20

It is taking too long - 30'590'144 rows so far. I'll do some more manual updates.

Actions #15

Updated by pjessen about 3 years ago

  • Due date set to 2021-04-26
  • % Done changed from 20 to 30

Have manually updated the rest of tumbleweed, 15'328'361 rows, leaving only repositories.

Actions #16

Updated by pjessen almost 3 years ago

  • Due date changed from 2021-04-26 to 2021-05-28
  • % Done changed from 30 to 40

Update - 59'263'266 rows of 127'726'101 done. 46%.

Actions #17

Updated by pjessen over 2 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 40 to 100