The error message for rook-ceph-mon-d is:
debug 2024-09-29T05:56:05.330+0000 7f8b5543f700 1 mon.d@2(electing) e5 handle_auth_request failed to assign global_id
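For reference, the quorum state and the affected mon's logs can be checked with commands along these lines (the rook-ceph namespace and the rook-ceph-mon-d deployment name are assumed from the default Rook naming and may differ in other clusters):

# overall health and quorum state, run from the Rook toolbox pod
ceph health detail
ceph quorum_status --format json-pretty

# recent log output of the affected mon
kubectl -n rook-ceph logs deploy/rook-ceph-mon-d --tail=100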
Comments
@shuaigea are you trying to restore the mon quorum and having some issues? It's not clear from the issue description.
Hello, the problem occurred when I added a new hard drive during expansion. The new drive has a different transfer speed from the original (old) drive even though it is the same brand: the old drive reads at about 7000 MB/s, which is fast, while the new drive only reads at about 3500 MB/s, and I am now seeing a slow-read warning. After investigation, the only difference I found is that the new drive's read speed differs from the old drive's. Is there clear guidance that drives added during expansion should have the same transfer speed? Or do I need to use the same brand of drive with the same read/write speed as the old one to stay consistent? Within what range of read-speed difference can slow reads be avoided?
I'm not sure about this one. @travisn @BlaineEXE do you have any thoughts on the above?
I still don't understand what the problem is. My intuition from reading between the lines is that the disk in question is being used for a mon and not an OSD, but I can't be sure.
@BlaineEXE It is indeed the mon that is causing the slow ops now, but I am not sure whether the cause is that the newly added hard drive is a different brand and does not have the same read/write performance as the original disk. My current workaround is to remove the newly added mon and let it be restored, but I still have doubts about future expansion: should a newly added drive's performance match the performance of the original drives?
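The mon replacement I did was roughly along the lines of the sketch below (the rook-ceph-mon-d name, the app=rook-ceph-mon label, and the roughly 10-minute failover timeout are assumptions based on default Rook settings, not confirmed from this cluster):

# scale the suspect mon down; with default health-check settings the
# Rook operator fails over a mon that stays out of quorum past the timeout
kubectl -n rook-ceph scale deploy rook-ceph-mon-d --replicas=0

# watch the replacement mon come up and quorum re-form
kubectl -n rook-ceph get pods -l app=rook-ceph-mon -w
ceph quorum_status --format json-pretty   # from the toolbox pod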
I still don't quite have a full enough understanding to help out here. I know you have added a new hard drive, but hard drives have multiple uses in Ceph, and I can't know exactly how the new drive is being used. If the new drive was used for an OSD, that shouldn't affect mons, so we would have to look into other causes and other cluster info. If the new drive backs the mon PVC (or otherwise holds mon data), then its performance could matter. The only guidance we have from the Ceph documentation on mon disks is this:
The Ceph project doesn't state specific throughput requirements, so I can't say for sure whether the disk's throughput is an issue or not. It is possible that the new disk is simply a faulty (or partly faulty) unit from the factory.
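One way to rule out a bad unit is to benchmark the raw device and check its SMART data; a minimal, read-only sketch (the /dev/nvme1n1 device name is a placeholder for the new drive):

# sequential read benchmark against the raw device (read-only, non-destructive)
fio --name=seqread --filename=/dev/nvme1n1 --rw=read --bs=4M --direct=1 --time_based --runtime=30 --readonly

# SMART health and error counters
smartctl -a /dev/nvme1n1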
[WRN] Health check update: 10 slow ops, oldest one blocked for 56 sec, mon.d has slow ops (SLOW_OPS)
9/29/24 4:31:51 PM [WRN] Health check update: 8 slow ops, oldest one blocked for 51 sec, mon.d has slow ops (SLOW_OPS)
9/29/24 4:31:51 PM [WRN] SLOW_OPS: 2 slow ops, oldest one blocked for 46 sec, mon.d has slow ops
9/29/24 4:31:51 PM [WRN] Health detail: HEALTH_WARN 2 slow ops, oldest one blocked for 46 sec, mon.d has slow ops
bash-4.4$ ceph -s
  cluster:
    id:     93cb51f5-56d6-4045-87d6-6e37d861a83e
    health: HEALTH_WARN
            1/3 mons down, quorum a,c

  services:
    mon: 3 daemons, quorum a,c,d (age 0.186447s)
    mgr: b(active, since 9w), standbys: a
    mds: 1/1 daemons up, 1 hot standby
    osd: 6 osds: 6 up (since 2d), 6 in (since 3d)

  data:
    volumes: 1/1 healthy
    pools:   3 pools, 49 pgs
    objects: 43.32k objects, 65 GiB
    usage:   204 GiB used, 11 TiB / 11 TiB avail
    pgs:     49 active+clean

  io:
    client: 136 KiB/s rd, 1005 KiB/s wr, 4 op/s rd, 64 op/s wr
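To see what is actually blocking on mon.d, the in-flight ops and the mon's own status can be dumped from its admin socket, along these lines (run inside the rook-ceph-mon-d pod; exact admin socket commands can vary slightly across Ceph releases):

# list the mon's currently tracked (including slow) operations
ceph daemon mon.d ops

# the mon's own view of quorum, rank, and election state
ceph daemon mon.d mon_status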