action #69577
closed
openQA Project - coordination #64746: [saga][epic] Scale up: Efficient handling of large storage to be able to run current tests efficiently but keep big archives of old results
openQA Project - coordination #80546: [epic] Scale up: Enable to store more results
Handle installation of the new "Storage Server"
0%
Description
We received our new Storage Server which we want to connect to openQA. It was delivered to SUSE, addressed to Ralf Unger, and is located in the Nuremberg post office. It has 2x 10 Gbit/s RJ45 ports which need a matching uplink. As this machine will communicate with openQA (a VM on the OBS cluster), it might make sense to place it close to this cluster (wherever that is located). If 10G does not work out at all we could start with 1G for now.
Please make sure to somehow obtain a copy of the invoice, which is normally taped to the outside of the parcel, and send it to @mgriessmeier so he can release the payment for our server. (already done)
So the task would be to open an infra ticket and ask them nicely to put the machine into the server room. You might also need to discuss how to connect the 10Gbit/s port since infra has no 10G hardware (AFAIK).
At the very least the machine needs to be moved away from the post office :)
Updated by nicksinger over 4 years ago
- Assignee set to okurz
- Priority changed from Normal to Urgent
@okurz: I'm assigning this to you for now because I fully trust that you will find the right person and maybe a volunteer. The urgency only reflects that it needs to move out of the post office. My office, the labs or a proper rack inside the server room would all be enough to reduce the prio to "low" ;)
Updated by okurz over 4 years ago
- Due date set to 2020-08-11
- Status changed from Workable to Blocked
Updated by okurz over 4 years ago
- Due date changed from 2020-08-11 to 2020-08-26
- Assignee changed from okurz to nicksinger
- Priority changed from Urgent to Normal
- Target version set to Ready
We have the ticket. It was assigned but has seen no update, and it is not clear whether any action has been taken. nsinger to check later.
Updated by nicksinger about 4 years ago
- Due date changed from 2020-08-26 to 2020-09-04
It got escalated to Ralf when I came back from vacation. Now Gerhard mentioned that the team will be back at full capacity next week and has supposedly scheduled it for then. Therefore I am pushing back the "due date".
Updated by okurz about 4 years ago
- Status changed from Blocked to In Progress
As you mentioned, the machine is mounted in the rack and accessible now. Please also look after the task in https://infra.nue.suse.com/SelfService/Display.html?id=175645, as you are on it already.
Updated by nicksinger about 4 years ago
- Status changed from In Progress to Workable
Keeping me assigned but setting the ticket to "Workable" as I'm currently not working on it. Whoever wants to give it a try can simply unassign me
Updated by okurz about 4 years ago
- Related to action #44078: Implement proper backups for o3 size:M added
Updated by livdywan about 4 years ago
- Due date changed from 2020-09-04 to 2020-11-30
See #76972 for the request for additional resources.
Updated by nicksinger almost 4 years ago
- Status changed from Workable to In Progress
The OS is installed now and reachable over ssh with its IP 10.160.66.189
We still need to decide how to set up the storage, mainly the RAID level and technology (mdadm raid, btrfs raid, FS).
Updated by livdywan almost 4 years ago
nicksinger wrote:
The OS is installed now and reachable over ssh with its IP 10.160.66.189
We still need to decide how to set up the storage, mainly the RAID level and technology (mdadm raid, btrfs raid, FS).
Maybe this makes sense to discuss in the weekly, perhaps together with the related point raised by @mkittler in os-autoinst/openQA#3635 (poo#88121)?
Updated by okurz almost 4 years ago
- Related to action #66709: Storage server for OSD and monitoring added
Updated by okurz almost 4 years ago
- Target version changed from future to Ready
Updated by nicksinger almost 4 years ago
- Target version changed from Ready to future
Discussed in the weekly:
- btrfs fs/raid
- nfs4 export
- hostname: storage
- basic salt integration (e.g. ssh)
Updated by nicksinger almost 4 years ago
- Target version changed from future to Ready
Updated by livdywan almost 4 years ago
- Due date changed from 2020-11-30 to 2021-02-12
Updated by livdywan almost 4 years ago
Since we were talking about the server in the daily I did a bit of smoke testing. I can ssh storage.qa.suse.de as my user, sudo works.
Just noticed one oddity - I can use sudo for just wondering if that might point to some other configuration issue:
$ dmesg
dmesg: read kernel buffer failed: Operation not permitted
@nicksinger is enabling the NFS share next
Updated by nicksinger almost 4 years ago
cdywan wrote:
Since we were talking about the server in the daily I did a bit of smoke testing. I can ssh storage.qa.suse.de as my user, sudo works.
Just noticed one oddity - I can use sudo for just wondering if that might point to some other configuration issue:
$ dmesg
dmesg: read kernel buffer failed: Operation not permitted
@nicksinger is enabling the NFS share next
Seems like a setting named "kernel.dmesg_restrict" was introduced in newer kernels. It is enabled on the storage server while it is disabled on all our workers. I assume this is mainly because the installation there is newer. I wouldn't bother changing it as we have root access anyway :)
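To compare the storage server with the workers, the setting can be read from /proc without root. The following is only a minimal sketch: the function name dmesg_restrict_status is made up for illustration, and the optional path argument exists only so the check can be tried against a test file.

```shell
# Report whether reading the kernel log is restricted to privileged users.
# Hypothetical helper; the optional first argument overrides the sysctl file.
dmesg_restrict_status() {
    file="${1:-/proc/sys/kernel/dmesg_restrict}"
    if [ ! -r "$file" ]; then
        # Not readable, or a kernel without this sysctl
        echo "unknown"
        return 0
    fi
    case "$(cat "$file")" in
        0) echo "unrestricted" ;;   # any user may run dmesg
        *) echo "restricted" ;;     # dmesg needs root (or CAP_SYSLOG)
    esac
}

# Print the status of the machine this runs on
dmesg_restrict_status
```

Running this on the storage server and on one worker should show the difference described above, assuming both kernels expose the sysctl.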
Updated by nicksinger almost 4 years ago
- Status changed from In Progress to Resolved
An NFS share named /storage is now enabled. I did a quick test from OSD where I was successfully able to mount it and write to that share. You can try it yourself by running mount -t nfs4 storage.qa.suse.de:/storage /storage.
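The quick test described above (mount, then write) could be repeated with something like the sketch below. The helper name smoke_test_share and the marker file name are assumptions for illustration. The mount itself is left as a comment because it needs root and an existing mount point.

```shell
# Write a small marker file below the given directory, read it back and
# remove it again. Returns 0 if all steps worked, non-zero otherwise.
# Hypothetical helper, not part of any existing tooling.
smoke_test_share() {
    dir="$1"
    probe="$dir/.smoke-test.$$"
    # Fails if the directory is missing, not mounted read-write, etc.
    { echo "ok" > "$probe"; } 2>/dev/null || return 1
    # Read the marker back and clean up before comparing
    content="$(cat "$probe" 2>/dev/null)"
    rm -f "$probe"
    [ "$content" = "ok" ]
}

# Example (as root, with /storage existing as an empty directory):
#   mount -t nfs4 storage.qa.suse.de:/storage /storage
#   smoke_test_share /storage && echo "share is writable"
```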
Updated by okurz almost 4 years ago
- Copied to action #88546: Make use of the new "Storage Server", e.g. complete OSD backup added
Updated by okurz over 3 years ago
- Copied to action #90629: administration of the new "Storage Server" added
Updated by okurz over 3 years ago
- Related to action #93683: osd-deployment failed due to storage.qa.suse.de not reachable by salt added
Updated by nicksinger almost 2 years ago
- Related to action #121282: Recover storage.qa.suse.de size:S added