Our production vCenter suddenly stopped working, and the web UI showed the message "no healthy upstream".
Log Analysis:
From the logs, we confirmed that the cause was vmdird being in a read-only state. We found that vsphere-ui was not starting because its pre-start command was failing in "create_and_assign_vsphere_client_solution_user_role":
2023-01-20T02:27:38.208Z Wa(03)+ host-1972 File "/usr/lib/vmware-vsphere-ui/firstboot/vsphere_ui_prestart.py", line 80, in <module>
2023-01-20T02:27:38.208Z Wa(03)+ host-1972 raise ex
2023-01-20T02:27:38.208Z Wa(03)+ host-1972 File "/usr/lib/vmware-vsphere-ui/firstboot/vsphere_ui_prestart.py", line 77, in <module>
2023-01-20T02:27:38.208Z Wa(03)+ host-1972 create_and_assign_vsphere_client_solution_user_role(VSPHERE_UI_FIRSTBOOT_CONFIG_DIR)
2023-01-20T02:27:38.208Z Wa(03)+ host-1972 File "/usr/lib/vmware-vsphere-ui/firstboot/solution_user_permission_utils.py", line 59, in create_and_assign_vsphere_client_solution_user_role
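When the pre-start traceback is buried in a long log, it can help to pull out just the frames. The following is a minimal sketch, not a VMware tool: the function name extract_frames and the regular expression are my own assumptions, written to match the frame lines exactly as they appear in the excerpt above.

```python
import re

# Assumption: frame lines look like the ones in the excerpt above, i.e.
#   ... File "<path>", line <number>, in <function>
FRAME_RE = re.compile(
    r'File "(?P<path>[^"]+)", line (?P<line>\d+), in (?P<func>\S+)'
)


def extract_frames(log_lines):
    """Return (path, line_number, function) for each traceback frame found."""
    frames = []
    for line in log_lines:
        match = FRAME_RE.search(line)
        if match:
            frames.append((match["path"], int(match["line"]), match["func"]))
    return frames


# Sample lines copied from the log excerpt in this post.
sample = [
    '2023-01-20T02:27:38.208Z Wa(03)+ host-1972 File "/usr/lib/vmware-vsphere-ui/firstboot/vsphere_ui_prestart.py", line 80, in <module>',
    "2023-01-20T02:27:38.208Z Wa(03)+ host-1972 raise ex",
    '2023-01-20T02:27:38.208Z Wa(03)+ host-1972 File "/usr/lib/vmware-vsphere-ui/firstboot/solution_user_permission_utils.py", line 59, in create_and_assign_vsphere_client_solution_user_role',
]

if __name__ == "__main__":
    for path, line_number, func in extract_frames(sample):
        print(f"{func} ({path}:{line_number})")
```

The last frame returned is the function where the pre-start step failed, which is the name to search for in the other service logs.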
Checking the Authorization (authz) logs, we found that the role update was failing because the underlying LDAP command failed:
2023-01-20T15:27:38.189+13:00 [dataservice-1 ERROR com.vmware.cis.core.authz.accesscontrol.impl.AuthorizationServiceInternalImpl opId=ac7e7b11-deb5-4687-9608-42285a1bef4c] Store Exception
com.vmware.cis.core.authz.exception.StoreException: java.lang.Exception: Failed to add entry to lotus
From the vmdird logs, LDAP modify operations were failing because vmdird was in read-only mode:
2023-01-20T15:27:38.194769+13:00 err vmdird t@139636651300608: InternalModifyEntry: VdirExecutePostModifyCommitPlugins - code(9114)
2023-01-20T15:27:38.194851+13:00 err vmdird t@139636651300608: VmDirSendLdapResult: Request (Modify), Error (LDAP_UNWILLING_TO_PERFORM(53)), Message (Server in read-only mode), (0) socket (127.0.0.1)
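To spot this condition quickly, the vmdird log can be filtered for the read-only error. This is only a rough sketch under stated assumptions: the helper name find_read_only_errors is hypothetical, and the pattern is based solely on the wording of the error line shown above, so other vCenter versions may log it differently.

```python
import re

# Assumption: the read-only condition is logged with the same wording as in
# the excerpt above (result code 53 plus "Server in read-only mode").
READ_ONLY_RE = re.compile(
    r"LDAP_UNWILLING_TO_PERFORM\(53\).*Server in read-only mode"
)


def find_read_only_errors(log_lines):
    """Return the log lines that report vmdird rejecting a write as read-only."""
    return [line.rstrip("\n") for line in log_lines if READ_ONLY_RE.search(line)]


# Sample lines copied from the vmdird excerpt in this post.
vmdird_sample = [
    "2023-01-20T15:27:38.194769+13:00 err vmdird t@139636651300608: InternalModifyEntry: VdirExecutePostModifyCommitPlugins - code(9114)",
    "2023-01-20T15:27:38.194851+13:00 err vmdird t@139636651300608: VmDirSendLdapResult: Request (Modify), Error (LDAP_UNWILLING_TO_PERFORM(53)), Message (Server in read-only mode), (0) socket (127.0.0.1)",
]

if __name__ == "__main__":
    for hit in find_read_only_errors(vmdird_sample):
        print(hit)
```

The function accepts any iterable of lines, so an open file handle for the vmdird log can be passed in place of the sample list.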
Issue:
• The vsphere-ui service is in a stopped state
Findings:
• We found that vsphere-ui was in a stopped state because its pre-start step could not write to vmdird, which was in read-only mode.
Fix:
• We ran vdcadmintool to confirm the vmdird state.
• Log in to the VCSA appliance through SSH, run /usr/lib/vmware-vmdir/bin/vdcadmintool, and select option 6 to get the vmdir state.
• Run the ./fixpsc.sh script on the problematic vCenter, using the healthy vCenter as the reference. After this, we were able to start the vsphere-ui service.
• Run /usr/lib/vmware-vmdir/bin/vdcadmintool once more and select option 6 to get the vmdir state.
• The VmDir state should now be Normal.
• At this point, you can access the vCenter web UI again.
Note: This is my own personal experience, which I am happy to share with the VMware community. I hope you find it useful. Thank you.