vCloud Director cells fail after vCenter reboot

A vCenter Server needs a reboot once in a while, especially a Windows-based vCenter after Windows updates have been installed. When vCloud Director is used in combination with that vCenter Server, the vCloud Director cells may lose their connection to it. Every vCD cell has a vCenter Proxy that connects to the vCenter to perform management operations such as creating a new VM or vApp or connecting to a VM console.

When you go through the vCloud Director log files, you may find errors like the following:

VC Listener disconnected
 - org.apache.http.conn.HttpHostConnectException: Connection to https://vcenter.customer.local:443 refused
 - Connection to https://vcenter.customer.local:443 refused
 - Connection refused
 - org.apache.http.conn.HttpHostConnectException: Connection to https://vcenter.customer.local:443 refused
 - Connection to https://vcenter.customer.local:443 refused
 - Connection refused
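A quick way to check whether a cell is affected is to search its log for these messages. The following is a minimal sketch, not an official tool: the function name count_vc_disconnects is hypothetical, and the log file name in the usage note is an assumption that may differ in your installation.

```shell
#!/bin/sh
# Count the log lines that point to a lost vCenter connection.
# Usage: count_vc_disconnects /path/to/logfile
count_vc_disconnects() {
    # grep -c prints the number of matching lines; -E enables the "|" alternation.
    # grep exits non-zero when nothing matches, so "|| true" keeps the
    # function's exit status at 0 while the printed count is still "0".
    grep -c -E 'VC Listener disconnected|HttpHostConnectException' "$1" || true
}
```

For example, assuming the debug log lives next to cell.log: count_vc_disconnects /opt/vmware/vcloud-director/logs/vcloud-container-debug.log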

To solve this issue, reboot the cells in a controlled fashion. On every cell, perform the following procedure:

1. Connect to the console of the first cell via SSH

2. Navigate to $VCLOUD_HOME/bin/

cd $VCLOUD_HOME/bin/

3. Suspend the scheduler by running this command:

./cell-management-tool -u username -p password cell -q true

4. View the tasks that are running using this command:

./cell-management-tool -u username -p password cell -t

Repeat step 4 until the task count reaches zero.

5. Shut down the cell by running this command:

./cell-management-tool -u username -p password cell -s

6. Check if the vCloud Director service has been stopped:

service vmware-vcd status

7. Restart the server:

reboot

8. After the reboot, check cell.log to confirm that the cell has started (application initialization has reached 100%):

tail -f /opt/vmware/vcloud-director/logs/cell.log

9. Repeat steps 1 to 8 for each remaining cell, one at a time.
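Steps 3 to 5 can be scripted so that you do not have to repeat the status command by hand. The sketch below rests on a few assumptions: the status command prints a line of the form "Job count = N" (verify this against the output of your version), $VCLOUD_HOME falls back to /opt/vmware/vcloud-director when unset, and the function names parse_job_count and drain_cell are hypothetical helpers, not part of vCloud Director.

```shell
#!/bin/sh
# Extract the number from a "Job count = N" line read on stdin.
# ASSUMPTION: "cell -t" prints a line in this form; check your version's output.
parse_job_count() {
    sed -n 's/^[[:space:]]*Job count = \([0-9][0-9]*\).*$/\1/p' | head -n 1
}

# Quiesce the cell, wait for running tasks to drain, then shut the cell down.
# Usage: drain_cell USER PASSWORD [POLL_SECONDS]
drain_cell() {
    user=$1 pass=$2 interval=${3:-10}
    # ASSUMPTION: default install path when VCLOUD_HOME is not set.
    tool="${VCLOUD_HOME:-/opt/vmware/vcloud-director}/bin/cell-management-tool"

    # Step 3: suspend the scheduler so no new tasks start on this cell.
    "$tool" -u "$user" -p "$pass" cell -q true || return 1

    # Step 4: poll until the task count reaches zero.
    while :; do
        count=$("$tool" -u "$user" -p "$pass" cell -t | parse_job_count)
        # Give up if the output cannot be parsed, rather than looping forever.
        [ -n "$count" ] || { echo "could not read job count" >&2; return 1; }
        [ "$count" -eq 0 ] && break
        echo "waiting: $count task(s) still running"
        sleep "$interval"
    done

    # Step 5: shut down the cell.
    "$tool" -u "$user" -p "$pass" cell -s
}
```

After the function returns successfully, for example from drain_cell username password 15, continue with step 6 to verify that the service has stopped before rebooting. Passing the password on the command line exposes it in the process list, so treat this as a convenience for a maintenance window rather than a hardened script.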

After all cells have been rebooted, the vCenter Proxy on each cell is active again, and customers can once more connect to their VMs and run management operations.

Johan van Amersfoort