tickets #2302

Hotstuff modules empty on stage.opensuse.org

Added by Anonymous about 8 years ago. Updated over 7 years ago.

Status:
Closed
Priority:
Urgent
Assignee:
Category:
Mirrors
Target version:
-
Start date:
2014-04-11
Due date:
2014-04-15
% Done:

100%

Estimated time:
2.00 h

Description

All hotstuff modules on stage.opensuse.org are currently empty.

The update-rsync-notify.py script hangs in a wait state and currently does nothing.

But all mirrors using our recommended rsync command line are now empty as well, which does not help to distribute the load. As a result, I have now disabled the hotstuff modules on rsync.opensuse.org and on stage.opensuse.org.

Coolo just told me that the script analyses the logs produced on stage.opensuse.org once they have been synced to langley.

Now I have 3 questions/remarks:
1) Could it be that the script that uploads the results (where is this one running?) is missing some sanity checks?
2) Why is this analyser not running directly on stage.opensuse.org?
3) langley has current logfiles, so why does it still not work?

History

#1 Updated by Anonymous about 8 years ago

  • Private changed from Yes to No

#2 Updated by Anonymous about 8 years ago

  • Description updated (diff)

#3 Updated by aplanas about 8 years ago

  • Status changed from New to In Progress

1) Could it be that the script that uploads the results (where is this one running?) is missing some sanity checks?

Absolutely. I am digging into it this morning. I am running the whole process from scratch to detect the problem, and I will update the issue with what I find.
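As an illustration of the kind of sanity check meant in question 1, here is a minimal sketch that refuses to publish when a result list is missing or empty. It assumes one plain-text list per size (for example a file named 30g) with one path per line; the function name check_lists, the file naming and the format are all hypothetical and not taken from the real upload script.

```python
from pathlib import Path


def check_lists(lists_dir, sizes, min_entries=1):
    """Return (ok, problems); ok is True only if every size has a usable list.

    Assumption: each size has a text file named after it inside lists_dir,
    with one file path per line. The real layout may differ.
    """
    problems = []
    for size in sizes:
        path = Path(lists_dir) / size
        if not path.is_file():
            problems.append(f"{size}: list file missing")
            continue
        with path.open() as fh:
            # Count only lines that actually contain something.
            entries = sum(1 for line in fh if line.strip())
        if entries < min_entries:
            problems.append(f"{size}: only {entries} entries")
    return (not problems, problems)
```

An upload step could call check_lists first and skip the rsync entirely when ok is False, so that mirrors keep their previous content instead of syncing an empty module.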

2) Why is this analyser not running directly on stage.opensuse.org?

The current architecture is a bit more convoluted. There are four machines involved:

1.- Calloway. The old Cab Calloway is the only machine with rights to access the mirrordb database (its IP is on the PostgreSQL whitelist). This machine uses crontab to run the 'mirror_brain' script, which creates the list of files (and sizes) that are candidates to be mirrored. Using 'crontab -e' we can see the entry:

0 3 * * * ( cd /suse/aplanas/Documents/09-rsyncd ; ./mirror_brain )

2.- Lena. My desktop. It is the one that waits for Calloway and runs the KP algorithm. It is also the one that creates the 'payload' file from the log files on langley. This file is used to deduce the importance of every file. With the mirror_brain file and the payload file, KP can deduce which files to mirror.

3.- pontifex2-opensuse. There is a script running that takes care of syncing the different directories that contain the results of KP. The current setup was improved by coolo in December, but basically it looks something like this:

ssh mirror@pontifex2-opensuse
ps aux | grep update-rsync-notify.py

The update-rsync-notify.py script is in /srv/rsync-modules/ and needs to be run like this:

python /srv/rsync-modules/update-rsync-notify.py --src /srv/ftp-stage/pub/opensuse/ --srcalt /srv/ftp/pub/opensuse/ --dst /srv/rsync-modules/ --lists /srv/rsync-modules/result-lists/ 30g 80g 160g 320g 640g > /srv/rsync-modules/update-rsync.log

The service can be found in

/etc/sv/update-rsync-modules

4.- External rsync. I don't know this machine because it was completely managed by coolo, but the setup is the same as on pontifex2-opensuse.
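To make the selection step on Lena more concrete, here is a rough sketch of how a list of files with sizes and a measure of importance can be turned into a file list for one module size. It assumes KP refers to a knapsack-style selection (suggested by the 'put-knapsacks' module name) and swaps in a simple greedy heuristic; the real KP code is not shown in this ticket. The names parse_size and select_files, the dictionary inputs, and the reading of '30g' as binary gigabytes are all assumptions.

```python
def parse_size(text):
    """Convert a size such as '30g' into bytes (suffixes k, m, g, t).

    Assumption: suffixes are binary multiples; the real script may differ.
    """
    units = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}
    text = text.strip().lower()
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)


def select_files(sizes, hits, capacity):
    """Greedy knapsack heuristic: take files with the most hits per byte first.

    sizes: {path: size_in_bytes}   (hypothetical stand-in for mirror_brain output)
    hits:  {path: download_count}  (hypothetical stand-in for the payload file)
    Returns (chosen_paths, bytes_used).
    """
    candidates = [p for p in sizes if sizes[p] > 0 and hits.get(p, 0) > 0]
    # Highest hits-per-byte first; the path breaks ties deterministically.
    candidates.sort(key=lambda p: (-hits[p] / sizes[p], p))
    chosen, used = [], 0
    for path in candidates:
        if used + sizes[path] <= capacity:
            chosen.append(path)
            used += sizes[path]
    return chosen, used
```

Running select_files once per size, with capacity taken from parse_size("30g"), parse_size("80g") and so on, would give one list per module. A greedy pass is not guaranteed to be optimal, which is one reason an exact knapsack solver may be preferred in practice.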

3) langley has current logfiles, so why does it still not work?

I will update this task with my findings. I am now running the scripts in 1), 2) and 3) by hand (the one in 2 is slow).

#4 Updated by aplanas about 8 years ago

Issues found:

1.- We upload the KP results (lists of files for the different KP sizes) to pontifex2 and widehat. This works on pontifex2 but not on widehat:

$ rsync -avz rsyncd-launch/* knapsacks@widehat-opensuse.suse.de::put-knapsacks
...
@ERROR: Unknown module 'put-knapsacks'

This put-knapsacks module is present on pontifex2-opensuse and works there.
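A failure like the one above is easy to miss when the upload runs unattended. The following sketch shows one way an upload wrapper could detect the "@ERROR: Unknown module" message and fail loudly. The function names missing_module and upload are hypothetical, and passing a directory with a trailing slash instead of the shell glob used above is an assumption made so that no shell is needed.

```python
import re
import subprocess

_UNKNOWN_MODULE = re.compile(r"@ERROR: Unknown module '([^']+)'")


def missing_module(rsync_output):
    """Return the module name if rsync reported it as unknown, else None."""
    match = _UNKNOWN_MODULE.search(rsync_output)
    return match.group(1) if match else None


def upload(source_dir, target):
    """Run rsync and raise a clear error when the target module is missing."""
    result = subprocess.run(
        ["rsync", "-avz", source_dir, target],
        capture_output=True,
        text=True,
    )
    module = missing_module(result.stdout + result.stderr)
    if module is not None:
        raise RuntimeError(f"rsync module {module!r} is not configured on the target")
    # Any other non-zero exit status is raised as CalledProcessError.
    result.check_returncode()
```

For example, upload("rsyncd-launch/", "knapsacks@widehat-opensuse.suse.de::put-knapsacks") would raise a RuntimeError naming put-knapsacks in the situation described here.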

#5 Updated by aplanas about 8 years ago

I do not have access to stage.opensuse.org, but as far as I can see it is working OK on pontifex2-opensuse.

#6 Updated by Anonymous about 8 years ago

pontifex2 is hosting stage

widehat can be fixed.

#7 Updated by coolo about 8 years ago

widehat/rsync.opensuse.org was reinstalled and lost the rsync module.

#8 Updated by aplanas about 8 years ago

darix wrote:

pontifex2 is hosting stage
widehat can be fixed.

Sorry, I do not know the external names of the servers. Also, why the h1 here? I am wearing my glasses to read.

As I said, pontifex2 (stage) looks like it is working OK. An easy way to check this is to look at the dates:

cd /srv/rsync-modules/30g/update/13.1/x86_64
ls -lt | less

As you can see, there are files from today, the 14th (apache2 update), Friday the 11th (curl) and Thursday the 10th (libmount1).
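The same date check can be automated. This is a minimal sketch, assuming that "fresh" simply means at least one regular file under the module directory was modified within the last few days; the function names and the three-day threshold are illustrative choices, not part of the existing setup.

```python
import os
import time


def newest_mtime(directory):
    """Return the latest modification time (epoch seconds) of any file
    under directory, or None if the tree contains no readable files."""
    newest = None
    for root, _dirs, files in os.walk(directory):
        for name in files:
            try:
                mtime = os.stat(os.path.join(root, name)).st_mtime
            except OSError:
                # Skip broken symlinks and files that vanish mid-scan.
                continue
            if newest is None or mtime > newest:
                newest = mtime
    return newest


def is_stale(directory, max_age_days=3, now=None):
    """True if the directory is empty or its newest file is too old."""
    now = time.time() if now is None else now
    newest = newest_mtime(directory)
    return newest is None or now - newest > max_age_days * 86400
```

Calling is_stale("/srv/rsync-modules/30g/update/13.1/x86_64") from a cron job would flag both an empty module and one that has silently stopped updating.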

The error report says that "All hotstuff modules on stage.opensuse.org are currently empty." but I can see the data of pontifex2 both from inside the server and from outside:

lena> rsync -a stage.opensuse.org::opensuse-hotstuff-30gb . --stats -n -h

So I have no idea what the problem is, apart from the widehat one, but that one does not match the description of this issue.

Can someone provide some feedback?

#9 Updated by aplanas about 8 years ago

  • % Done changed from 0 to 90

#10 Updated by aplanas about 8 years ago

This morning the 'put-knapsacks' module was still missing on widehat. I do not have rights to create it. Please ping me when it is there and I will relaunch the synchronization.

Another thing that needs to be checked on widehat is whether there is a service /etc/service/update-rsync-modules like on pontifex2-opensuse. I tried to check this myself, but I do not have access to the machine as the mirror user.

#11 Updated by aplanas about 8 years ago

  • Status changed from In Progress to Feedback

#12 Updated by aplanas about 8 years ago

Following darix's instructions, I changed pontifex2 to pontifex3 in the put-knapsack script. Widehat is still not working.

#13 Updated by aplanas about 8 years ago

After hitting me so hard over this, please do not let this task fall into oblivion. Widehat still does not have the put-knapsack module.

Is this still relevant?

#14 Updated by Anonymous about 8 years ago

I'm out of the office until Thursday, 14th of April 2014.

During my absence, please contact

  • autobuild@suse.de for all questions around Autobuild and the Build Service
  • ops-services@suse.de for all questions around OPS related tasks
  • Andreas Mach as my deputy for all other questions

With kind regards
Lars Vogdt

--
Lars Vogdt

  • OPS Engineering Services Team Lead - SUSE Linux Products GmbH - GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, Maxfeldstraße 5, 90409 Nuernberg, Germany - HRB 16746 (AG Nuernberg)

04/16/14 09:27 >>>

[openSUSE Tracker]
Issue #2302 has been updated by aplanas.


tickets #2302: Hotstuff modules empty on stage.opensuse.org
https://progress.opensuse.org/issues/2302#change-9080

  • Author: lrupp
  • Status: Feedback
  • Priority: Urgent
  • Assignee: aplanas
  • Category: Mirrors
  • Target version:

* Due subtask: 15/04/2014


#15 Updated by aplanas about 8 years ago

Any feedback here?

#16 Updated by aplanas about 8 years ago

Creating the RPM as requested.

#17 Updated by aplanas about 8 years ago

widehat still does not have the put-knapsack module. Feedback, please?

#18 Updated by Anonymous almost 8 years ago

  • Parent task set to #2280

#19 Updated by aplanas almost 8 years ago

Explained the details about KP (calloway, lena and pontifex) to Gerhard Schlotter (gschlotter@suse.com).

Sent important links and some notes.

#20 Updated by aplanas over 7 years ago

  • Status changed from Feedback to Resolved

#21 Updated by Anonymous over 7 years ago

  • Status changed from Resolved to Closed
  • % Done changed from 90 to 100
