openSUSE admin - tickets #93686: Postgres currently downhttps://progress.opensuse.org/issues/93686?journal_id=4144792021-06-09T08:12:42Zhellcphel@lcp.world
<ul><li><strong>Private</strong> changed from <i>Yes</i> to <i>No</i></li></ul> openSUSE admin - tickets #93686: Postgres currently downhttps://progress.opensuse.org/issues/93686?journal_id=4144822021-06-09T08:16:26Zpjessenper@computer.org
<ul></ul><p>I tried restarting it twice; both attempts resulted in a segfault.</p>
<p>The same thing repeatedly:</p>
<pre><code># grep segfault /var/log/messages
2021-06-05T10:56:53.856510+00:00 mirrordb1 kernel: [1344517.476156] postgres[21877]: segfault at 5626bf85dffa ip 00007ff966d76147 sp 00007ffd90595538 error 4 in libc-2.31.so[7ff966c30000+1cb000]
2021-06-05T12:03:59.645338+00:00 mirrordb1 kernel: [1348543.361004] postgres[17228]: segfault at 5626bf85dffa ip 00007ff966d76133 sp 00007ffd90595538 error 4 in libc-2.31.so[7ff966c30000+1cb000]
2021-06-05T12:07:44.854244+00:00 mirrordb1 kernel: [1348768.627274] postgres[18005]: segfault at 5626bf85dffa ip 00007ff966d76151 sp 00007ffd90595538 error 4 in libc-2.31.so[7ff966c30000+1cb000]
2021-06-05T12:12:10.410150+00:00 mirrordb1 kernel: [1349034.190949] postgres[19194]: segfault at 5626bf85dffa ip 00007ff966d76133 sp 00007ffd90595538 error 4 in libc-2.31.so[7ff966c30000+1cb000]
2021-06-05T12:18:45.116673+00:00 mirrordb1 kernel: [1349428.912847] postgres[20605]: segfault at 5626bf85dffa ip 00007ff966d76151 sp 00007ffd90595538 error 4 in libc-2.31.so[7ff966c30000+1cb000]
2021-06-05T12:26:47.459836+00:00 mirrordb1 kernel: [1349911.271712] postgres[22555]: segfault at 5626bf85dffa ip 00007ff966d7612e sp 00007ffd90595538 error 4 in libc-2.31.so[7ff966c30000+1cb000]
2021-06-05T12:33:10.156571+00:00 mirrordb1 kernel: [1350293.982984] postgres[24549]: segfault at 5626bf85dffa ip 00007ff966d7614c sp 00007ffd90595538 error 4 in libc-2.31.so[7ff966c30000+1cb000]
2021-06-08T15:26:17.358152+00:00 mirrordb1 kernel: [1619890.699769] postgres[4776]: segfault at 5626bf85dffa ip 00007ff966d76147 sp 00007ffd90595438 error 4 in libc-2.31.so[7ff966c30000+1cb000]
2021-06-08T18:28:07.089464+00:00 mirrordb1 kernel: [1630800.817691] postgres[28928]: segfault at 5626bf85dffa ip 00007ff966d76151 sp 00007ffd90595438 error 4 in libc-2.31.so[7ff966c30000+1cb000]
2021-06-08T23:04:58.963109+00:00 mirrordb1 kernel: [1647413.271332] postgres[30652]: segfault at 5626bf85dffa ip 00007ff966d76138 sp 00007ffd90595538 error 4 in libc-2.31.so[7ff966c30000+1cb000]
2021-06-09T03:25:10.534559+00:00 mirrordb1 kernel: [1663025.397677] postgres[28557]: segfault at 5626bf85dffa ip 00007ff966d76147 sp 00007ffd90595438 error 4 in libc-2.31.so[7ff966c30000+1cb000]
2021-06-09T04:58:06.275839+00:00 mirrordb1 kernel: [1668601.338594] postgres[29635]: segfault at 5626bf85dffa ip 00007ff966d7612e sp 00007ffd90595438 error 4 in libc-2.31.so[7ff966c30000+1cb000]
2021-06-09T04:59:07.199999+00:00 mirrordb1 kernel: [1668662.268971] postgres[999]: segfault at 5626bfbeabf2 ip 00007ff966d73700 sp 00007ffd90592918 error 4 in libc-2.31.so[7ff966c30000+1cb000]
2021-06-09T07:51:34.897584+00:00 mirrordb1 kernel: [1679010.333734] postgres[24774]: segfault at 5618704bfc92 ip 00007ff512413700 sp 00007ffeae458f08 error 4 in libc-2.31.so[7ff5122d0000+1cb000]
2021-06-09T07:59:14.420472+00:00 mirrordb1 kernel: [1679469.872615] postgres[29702]: segfault at 5576df7d1c92 ip 00007f444ba53700 sp 00007fff8a7ecce8 error 4 in libc-2.31.so[7f444b910000+1cb000]
2021-06-09T08:05:50.521976+00:00 mirrordb1 kernel: [1679865.986628] postgres[1349]: segfault at 55b0a9cb1c82 ip 00007f2aacf9b700 sp 00007ffd2bed75f8 error 4 in libc-2.31.so[7f2aace58000+1cb000]
</code></pre> openSUSE admin - tickets #93686: Postgres currently downhttps://progress.opensuse.org/issues/93686?journal_id=4145722021-06-09T09:08:20Zpjessenper@computer.org
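<p>For reference, the kernel lines above can be reduced to an offset inside libc by subtracting the mapping base (the value in square brackets) from the faulting instruction pointer. The sketch below is only an illustration: the regular expression and the helper name <code>libc_offset</code> are my own, not part of any existing tool. It shows that the four most recent crashes (PIDs 999, 24774, 29702 and 1349) all fault at the same offset, 0x143700, even though ASLR gives each postmaster a different base address. Mapping that offset to a function name would still need the matching libc debuginfo.</p>

```python
import re

# Pattern for kernel segfault lines as they appear in /var/log/messages.
# The group names are chosen for this sketch only.
SEGFAULT_RE = re.compile(
    r"(?P<prog>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def libc_offset(line):
    """Return (library, offset of ip within the library) or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    return m["lib"], int(m["ip"], 16) - int(m["base"], 16)


if __name__ == "__main__":
    # Two lines copied from the log excerpt above.
    samples = [
        "postgres[21877]: segfault at 5626bf85dffa ip 00007ff966d76147 "
        "sp 00007ffd90595538 error 4 in libc-2.31.so[7ff966c30000+1cb000]",
        "postgres[1349]: segfault at 55b0a9cb1c82 ip 00007f2aacf9b700 "
        "sp 00007ffd2bed75f8 error 4 in libc-2.31.so[7f2aace58000+1cb000]",
    ]
    for line in samples:
        lib, offset = libc_offset(line)
        print(lib, hex(offset))
```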
<ul></ul><p>I'm no wizard at debugging postgres, but at the first segfault this morning:</p>
<pre><code>2021-06-09 03:25:10.552 UTC [1159]: [302-1] db=,user= LOG: server process (PID 28557) was terminated by signal 11: Segmentation fault
2021-06-09 03:25:10.552 UTC [1159]: [303-1] db=,user= DETAIL: Failed process was running: SELECT COUNT(mirr_del_byid(169, id) order by id) FROM temp1
2021-06-09 03:25:10.552 UTC [1159]: [304-1] db=,user= LOG: terminating any other active server processes
</code></pre> openSUSE admin - tickets #93686: Postgres currently downhttps://progress.opensuse.org/issues/93686?journal_id=4146082021-06-09T10:04:42Zmstriglmarco.strigl@suse.com
<ul></ul><p>We are currently working on it.</p>
openSUSE admin - tickets #93686: Postgres currently downhttps://progress.opensuse.org/issues/93686?journal_id=4146232021-06-09T10:45:45Zmstriglmarco.strigl@suse.com
<ul></ul><p>postgresql on mirrordb1 is up again.<br>
We used a snapshot from yesterday.</p>
<p>I enabled core dumps on mirrordb1 so that we get a core dump the next time it crashes.</p>
openSUSE admin - tickets #93686: Postgres currently downhttps://progress.opensuse.org/issues/93686?journal_id=4157742021-06-12T07:35:19Zhellcphel@lcp.world
<ul></ul><p>Aaaaand it's down again</p>
openSUSE admin - tickets #93686: Postgres currently downhttps://progress.opensuse.org/issues/93686?journal_id=4157772021-06-12T07:57:40Zpjessenper@computer.org
<ul></ul><p>Looks similar to last time:</p>
<pre><code>2021-06-11 18:46:12.227 UTC [24774]: [27-1] db=,user= LOG: server process (PID 10410) was terminated by signal 11: Segmentation fault
2021-06-11 18:46:12.227 UTC [24774]: [28-1] db=,user= DETAIL: Failed process was running: SELECT COUNT(mirr_del_byid(583, id) order by id) FROM temp1
2021-06-11 18:46:12.227 UTC [24774]: [29-1] db=,user= LOG: terminating any other active server processes
</code></pre> openSUSE admin - tickets #93686: Postgres currently downhttps://progress.opensuse.org/issues/93686?journal_id=4159812021-06-14T12:31:42Zandriinikitinandrii.nikitin@suse.com
<ul></ul><p>Bernhard made a backup of the datadir today, so I decided to try resetting the write-ahead log to see if it helps (this loses the most recent transactions, but I don't think we have a better option).<br>
After running <code>postgres@mirrordb1:~> pg_resetwal -f /var/lib/pgsql/data</code> (as the postgres user), the service was able to start.</p>
openSUSE admin - tickets #93686: Postgres currently downhttps://progress.opensuse.org/issues/93686?journal_id=4161312021-06-14T18:50:32Zcboltzsuse-beta@cboltz.de
<ul></ul><p>For the record: andriinikitin migrated the databases to mirrordb2 and changed pgbouncer on anna/elsa so that it now uses mirrordb2. So for now, everything works again.</p>
<p>The reasons for the segfaults and for duplicate rows being allowed in unique indexes are still unclear.</p>
openSUSE admin - tickets #93686: Postgres currently downhttps://progress.opensuse.org/issues/93686?journal_id=4164642021-06-15T17:34:01ZKaratekHD
<ul></ul><p>It seems to be down again; at least Matrix is, which was affected by the last downtime too.</p>
openSUSE admin - tickets #93686: Postgres currently downhttps://progress.opensuse.org/issues/93686?journal_id=4164672021-06-15T17:39:02Zpjessenper@computer.org
<ul></ul><p>KaratekHD wrote:</p>
<blockquote>
<p>It seems to be down again; at least Matrix is, which was affected by the last downtime too.</p>
</blockquote>
<p>Confirmed. </p>
openSUSE admin - tickets #93686: Postgres currently downhttps://progress.opensuse.org/issues/93686?journal_id=4165032021-06-15T20:21:27Zbmwiedemannbmwiedemann@suse.de
<ul></ul><p>The current plan is to downgrade to postgresql12 on mirrordb2, re-import the SQL dumps, and make sure that the unique indexes on mirrorbrain's filearr(path) entries exist and actually work this time.</p>
<p>Meanwhile, I edited the download.o.o DNS to have 2 A and 2 AAAA records, shifting 50% of the load to mirrorcache so that neither of them gets overloaded.<br>
Without the DB, download.o.o would point every user to its downloadcontent.o.o alias and cause packet loss, and mirrorcache is not yet optimized enough to handle 300 requests per second.</p>
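<p>As a rough illustration of the filearr(path) uniqueness check mentioned above, the sketch below looks for paths that occur more than once. It is not the procedure that was actually used: it runs against an in-memory SQLite table as a stand-in for PostgreSQL, the table layout is reduced to a hypothetical <code>id</code> and <code>path</code> column, and the helper names are made up for this example. On a real PostgreSQL instance with a possibly broken index, it may also be worth forcing a sequential scan so that the query does not read from the very index under suspicion.</p>

```python
import sqlite3

# Paths that appear more than once would violate a working unique index.
DUPLICATE_PATHS_SQL = """
SELECT path, COUNT(*) AS copies
FROM filearr
GROUP BY path
HAVING COUNT(*) > 1
ORDER BY path
"""


def find_duplicate_paths(conn):
    """Return a list of (path, copies) for every path stored more than once."""
    return conn.execute(DUPLICATE_PATHS_SQL).fetchall()


def demo():
    # In-memory stand-in table; the real filearr schema is assumed, not copied.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE filearr (id INTEGER PRIMARY KEY, path TEXT)")
    conn.executemany(
        "INSERT INTO filearr (path) VALUES (?)",
        [("a/x.rpm",), ("a/y.rpm",), ("a/x.rpm",)],
    )
    return find_duplicate_paths(conn)


if __name__ == "__main__":
    for path, copies in demo():
        print(path, copies)
```

<p>Any rows reported by such a query would have to be cleaned up before a unique index on the column can be created successfully.</p>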
openSUSE admin - tickets #93686: Postgres currently downhttps://progress.opensuse.org/issues/93686?journal_id=4165242021-06-16T00:47:15Zandriinikitinandrii.nikitin@suse.com
<ul></ul><p>mirrorcache.o.o has run out of disk space and has some other issues, so I reverted the DNS change; download.o.o is redirecting to mirrors properly at the moment.</p>
openSUSE admin - tickets #93686: Postgres currently downhttps://progress.opensuse.org/issues/93686?journal_id=4166112021-06-16T06:09:16Zbmwiedemannbmwiedemann@suse.de
<ul><li><strong>% Done</strong> changed from <i>0</i> to <i>10</i></li></ul><p>Filed <a href="https://bugzilla.opensuse.org/show_bug.cgi?id=1187392" class="external">https://bugzilla.opensuse.org/show_bug.cgi?id=1187392</a> for our postgresql13 segfault</p>
openSUSE admin - tickets #93686: Postgres currently downhttps://progress.opensuse.org/issues/93686?journal_id=4282032021-07-19T08:46:58Zlrupp
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Closed</i></li><li><strong>% Done</strong> changed from <i>10</i> to <i>100</i></li></ul><p>The issue has meanwhile been solved by using the latest backup and a clean postgresql12 installation. Closing here.<br>
For further reference, please check <a href="https://bugzilla.opensuse.org/show_bug.cgi?id=1187392" class="external">https://bugzilla.opensuse.org/show_bug.cgi?id=1187392</a></p>