action #52487

[webui] osd: audit log page takes 5 seconds to render a page

Added by coolo 9 months ago. Updated 6 months ago.

Status: Resolved
Start date: 03/06/2019
Priority: Normal
Due date:
Assignee: mkittler
% Done: 0%
Category: Feature requests
Target version: Done
Difficulty:
Duration:

Description

It's possible we just have too many audit logs, but maybe an index can help, or we need to evaluate retention policies.

History

#1 Updated by okurz 8 months ago

  • Subject changed from osd: audit log page takes 5 seconds to render a page to [webui] osd: audit log page takes 5 seconds to render a page
  • Category changed from 124 to Feature requests

#2 Updated by mkittler 8 months ago

  • Status changed from New to Feedback
  • Assignee set to mkittler

Trying to improve performance by adding additional indexes is likely overkill (we already have an index on the user_id column). So I would go for "retention policies".

I guess our current "retention policy" is to keep those logs forever. So we currently have 2,624,096 entries on OSD (the oldest is from 4 years ago).

The simplest solution is adding a 'Delete logs older than ...' button on the audit log page. I could also make this a Minion job which is executed from time to time and deletes all logs older than an age specified in the config file.

#3 Updated by coolo 8 months ago

Not sure what the index for the user id is for. But we paginate by date, no? So shouldn't we also have an index for that?

#4 Updated by mkittler 8 months ago

> Not sure what the index for the user id is for.

DBIx::Class likely added it automatically for some relation.

And yes, by default the table is sorted by t_created. Not sure whether adding an index speeds up pagination, though. But I guess simply cleaning old entries up makes more sense.

#5 Updated by coolo 8 months ago

I think one doesn't exclude the other. 2.6 million rows is a lot, but nothing for a well-indexed database :)

And the question is how long you want to keep the data. Job posts are no longer interesting after a week or so, ISO posts after a month, table edits - well - 2 years or so?
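A per-category retention cleanup along these lines could be sketched as follows. This is only an illustration, not the code that was merged for openQA: it uses Python with an in-memory SQLite table standing in for the real PostgreSQL `audit_events` table, and the `event` column, the name prefixes (`job_`, `iso_`, `table_`) and the `RETENTION_DAYS` mapping are assumptions made up for the example. Only `t_created` and the rough durations come from the discussion above.

```python
import sqlite3
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

# Hypothetical retention limits in days, keyed by event-name prefix.
# In a real setup these would come from a config file.
RETENTION_DAYS = {"job_": 7, "iso_": 30, "table_": 730}


def delete_expired_events(conn, now, retention_days=RETENTION_DAYS):
    """Delete events older than the limit for their category.

    Returns the number of deleted rows per prefix.
    """
    deleted = {}
    for prefix, days in retention_days.items():
        cutoff = (now - timedelta(days=days)).strftime(FMT)
        # substr() is used instead of LIKE because "_" is a LIKE wildcard.
        cur = conn.execute(
            "DELETE FROM audit_events "
            "WHERE substr(event, 1, ?) = ? AND t_created < ?",
            (len(prefix), prefix, cutoff),
        )
        deleted[prefix] = cur.rowcount
    conn.commit()
    return deleted


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE audit_events ("
        "id INTEGER PRIMARY KEY, event TEXT NOT NULL, t_created TEXT NOT NULL)"
    )
    now = datetime(2019, 6, 3, 12, 0, 0)
    rows = [
        ("job_create", now - timedelta(days=10)),    # past the 7-day limit
        ("job_create", now - timedelta(days=1)),     # recent, kept
        ("iso_create", now - timedelta(days=40)),    # past the 30-day limit
        ("table_update", now - timedelta(days=40)),  # within 2 years, kept
    ]
    conn.executemany(
        "INSERT INTO audit_events (event, t_created) VALUES (?, ?)",
        [(event, created.strftime(FMT)) for event, created in rows],
    )
    deleted = delete_expired_events(conn, now)
    remaining = [
        r[0] for r in conn.execute("SELECT event FROM audit_events ORDER BY id")
    ]
    return deleted, remaining


if __name__ == "__main__":
    print(demo())
```

Running such a function periodically (for example from a background job) keeps the table size bounded without anyone having to press a button.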

#6 Updated by mkittler 7 months ago

  • Status changed from Feedback to In Progress
  • Target version changed from Ready to Current Sprint

Ok, then I'll start with the retention policy and check how much performance an index would gain us. I guess the mentioned limits make sense, but like the other limits this should likely be configurable.

#7 Updated by mkittler 7 months ago

#8 Updated by mkittler 7 months ago

The PR has been merged. The next step is providing the salt config to enable the cleanup on OSD.

Then I can check whether adding an index is worth it.

#9 Updated by mkittler 7 months ago

  • Status changed from In Progress to Resolved

OSD is now configured to run the cleanup and it seems to work. The audit events table now loads a little faster. If required, we can still reduce the storage duration (of certain event types).

I also tested locally whether adding an index would improve the speed. By default the rows are sorted by t_created so I added an index for this column: https://github.com/Martchus/openQA/pull/new/audit-events-index

At least locally (with OSD data) this does not make it noticeably faster (checked query times with Chromium dev tools and nytprof). Admittedly my local machine is much faster than OSD, but I still doubt an index will be beneficial.

(And of course I checked whether the index is actually present with SELECT tablename, indexname, indexdef FROM pg_indexes WHERE schemaname = 'public' and tablename = 'audit_events';.)
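For reference, the index experiment can be reproduced in miniature. The sketch below is an assumption-laden stand-in, not the contents of the branch: it uses Python with in-memory SQLite instead of PostgreSQL, so `sqlite_master` plays the role of `pg_indexes` and `EXPLAIN QUERY PLAN` the role of PostgreSQL's `EXPLAIN`. The index name `idx_audit_events_t_created`, the `event` column and the page size of 25 are made up for illustration.

```python
import sqlite3

# Paginated listing sorted by creation time, newest first.
PAGE_QUERY = (
    "SELECT id, event, t_created FROM audit_events "
    "ORDER BY t_created DESC LIMIT ? OFFSET ?"
)


def build_db(rows=1000):
    """Create a small audit_events table with unique timestamps."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE audit_events ("
        "id INTEGER PRIMARY KEY, user_id INTEGER, event TEXT, t_created TEXT)"
    )
    conn.executemany(
        "INSERT INTO audit_events (user_id, event, t_created) VALUES (?, ?, ?)",
        [
            (i % 5, "job_create", f"2019-01-01 00:{i // 60:02d}:{i % 60:02d}")
            for i in range(rows)
        ],
    )
    return conn


def query_plan(conn):
    """Return the planner's description of the page query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + PAGE_QUERY, (25, 0)).fetchall()
    return " | ".join(row[-1] for row in rows)


def index_names(conn):
    """List the indexes on audit_events (SQLite analogue of pg_indexes)."""
    return [
        r[0]
        for r in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'index' AND tbl_name = 'audit_events'"
        )
    ]


def demo():
    conn = build_db()
    plan_before = query_plan(conn)
    page_before = conn.execute(PAGE_QUERY, (25, 0)).fetchall()
    conn.execute(
        "CREATE INDEX idx_audit_events_t_created ON audit_events (t_created)"
    )
    return {
        "plan_before": plan_before,
        "plan_after": query_plan(conn),
        "indexes": index_names(conn),
        "same_page": page_before == conn.execute(PAGE_QUERY, (25, 0)).fetchall(),
    }


if __name__ == "__main__":
    for key, value in demo().items():
        print(key, "=>", value)
```

A changed query plan only shows that the planner can use the index to avoid a sort; whether that translates into a noticeably faster page still has to be measured on realistic data, as done above.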

So I would mark this issue as resolved. I'll keep the branch for adding the index as a reference for adding indexes.

#10 Updated by mkittler 6 months ago

  • Target version changed from Current Sprint to Done
