Lync 2013 Front End Service – Starting

This may be an obvious post, but I thought I'd write it up for clarity so that people can complete this process with confidence. Lync admins may well need to do this when they come across the issue, which would not have been a problem in previous versions of the product.

I recently came across an issue where a single Front End server in an Enterprise Edition pool could not start its Lync Front End service. The service was stuck in the “Starting” state. The server appeared to be handling some user logon requests, but the pool seemed to be suffering intermittent issues. I had no knowledge of the pool's recent history, but I had been told that no changes had been made and that all servers and services had previously been available. The server in question was connected to the network and could ping the other two Front End servers in the pool as well as the SQL servers. There seemed to be no network issues at all.

This server was reporting the following error:

“Event 32174, LS User Services

Server startup is being delayed because fabric pool manager has not finished initial placement of users.

Currently waiting for routing group: {<Routing Group ID>}.
Number of groups potentially not yet placed: 3.
Total number of groups: 3.
Cause: This is normal during cold-start of a Pool and during server startup.
If you continue to see this message many times, it indicates that insufficient number of Front-Ends are available in the Pool.
Resolution:
During a cold-start of a large Pool it can take upto an hour for the placement process to finish as it needs to populate all the Front-End databases with data from the Backup Store. If the Pool is running and the Front-End is just started, this is normal for some time. If this repeats for a long time, ensure that all the Front-Ends configured for this Pool are up and running. If multiple Front-Ends have been recently decommissioned, run Reset-CsPoolRegistrarState -ResetType QuorumLossRecovery to enable the Pool to recover from Quorum Loss and make progress.”

I first confirmed that there were no non-self-signed certificates in the Trusted Root Certificate Authorities store. I then decided to run the command identified in the error message: Reset-CsPoolRegistrarState. There were multiple pools in the environment, so I specified the pool explicitly and my command looked like this:

Reset-CsPoolRegistrarState -PoolFqdn <Pool FQDN> -ResetType QuorumLossRecovery

This command restarts the services on all Front End servers in the pool, so it should be treated as service impacting. However, a few minutes after the command had completed, all Lync services on all Front End servers had restarted successfully and the pool was operating normally again.
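For anyone repeating this, the checks made before running the reset can be sketched in the Lync Server Management Shell roughly as follows. This is only an illustration, not a record of what was run at the time: pool01.contoso.local is a placeholder pool name, and Get-CsPoolFabricState is, as far as I know, only available once a later Lync Server 2013 cumulative update has been installed.

```powershell
# Placeholder pool name (assumption); substitute your own pool FQDN.
$poolFqdn = 'pool01.contoso.local'

# List certificates in the Trusted Root store whose issuer differs from their
# subject, i.e. certificates that are not self-signed.
Get-ChildItem Cert:\LocalMachine\Root |
    Where-Object { $_.Issuer -ne $_.Subject } |
    Format-List Subject, Issuer, Thumbprint

# Show the state of the Lync services on the local Front End server.
Get-CsWindowsService

# Summarise the Windows Fabric view of the pool
# (assumes the cmdlet is present in your build).
Get-CsPoolFabricState -PoolFqdn $poolFqdn
```

If the certificate query returns nothing, the Trusted Root store contains only self-signed certificates, which is the state checked for above.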

I don't know what caused the issue in this case, but I would like to reiterate that restarting Lync Front End servers should not be treated as a trivial task. If you need to restart a Front End server, make sure that it comes back up cleanly and settles down before you consider restarting any other Front End server in the pool. This ensures that Windows Fabric keeps running in the background; it needs a minimum number of servers from the pool to be available at all times. If you need to shut the whole pool down, note the order in which you shut down the Front End servers. When you bring the pool back up, boot the Front End servers in the reverse of the order in which they were shut down.
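As a rough way of checking that a restarted Front End server has come back before moving on to the next one, something like the sketch below could be used. The server name fe01.contoso.local is hypothetical, and services reporting Running is only a first indication, not proof that the pool has fully settled.

```powershell
# Hypothetical server name (assumption); replace with the server just restarted.
$frontEnd = 'fe01.contoso.local'

# Collect any Lync services on that server that are not yet running.
$notRunning = Get-CsWindowsService -ComputerName $frontEnd |
    Where-Object { $_.Status -ne 'Running' }

if ($notRunning) {
    # Still starting or stopped: wait before touching another Front End server.
    $notRunning | Format-Table Name, Status -AutoSize
} else {
    Write-Host "All Lync services on $frontEnd report Running."
}
```

It is also worth watching the Lync Server event log on that server for further 32174 events before restarting anything else.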
