Hi
I'm working on a large CRM 2011 installation that has been having problems since a recent document import task. The System Jobs queue contains thousands of related System Event jobs stuck in a "Waiting For Resources" state, and they are being processed very slowly. Yesterday it took almost the entire day to work through ~1000 of these jobs, and by the time they had finally been processed this morning, ~2000 more had taken their place.
I've tried restarting the Async Processing service. The jobs briefly change to "In Progress", but a short while later they revert to "Waiting For Resources". Whilst investigating this, I noticed that the Event Viewer on the Async Processing server shows repeated error messages from MSCRMAsyncService$maintenance reporting a problem connecting to the database. These occur at a rate of ~3-4 per hour:
Host <servername>.MSCRMAsyncService$maintenance.334f7029-2f26-42ee-8d1d-1d3cf4d2b2e1: a config database error occured. Exception: System.Data.SqlClient.SqlException (0x80131904): A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server) ---> System.ComponentModel.Win32Exception (0x80004005): The network path was not found
at System.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, UInt32 waitForMultipleObjectsTimeout, Boolean allowCreate, Boolean onlyOneCheckConnection, DbConnectionOptions userOptions, DbConnectionInternal& connection)
at System.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, TaskCompletionSource`1 retry, DbConnectionOptions userOptions, DbConnectionInternal& connection)
at System.Data.ProviderBase.DbConnectionFactory.TryGetConnection(DbConnection owningConnection, TaskCompletionSource`1 retry, DbConnectionOptions userOptions, DbConnectionInternal oldConnection, DbConnectionInternal& connection)
at System.Data.ProviderBase.DbConnectionInternal.TryOpenConnectionInternal(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource`1 retry, DbConnectionOptions userOptions)
at System.Data.SqlClient.SqlConnection.TryOpenInner(TaskCompletionSource`1 retry)
at System.Data.SqlClient.SqlConnection.TryOpen(TaskCompletionSource`1 retry)
at System.Data.SqlClient.SqlConnection.Open()
at Microsoft.Crm.CrmDbConnection.GetCreateAndOpenConnection()
at Microsoft.Crm.CrmDbConnection.Open()
at Microsoft.Crm.Asynchronous.JobDataAccess.ExecuteSqlOrganizationScopeAndProcessRecords(IDbCommand command, Guid organizationId, RecordProcessor recordProcessor)
at Microsoft.Crm.Asynchronous.JobDataAccess.RetrieveSqlServerName(Guid orgId)
at Microsoft.Crm.Asynchronous.JobDataAccess.UpdateJobTargetServer(AsyncJob job)
at Microsoft.Crm.Asynchronous.JobDataAccess.GetNextJob(IList`1 orgsAvailableForMaintenance, DateTime startCycleTime, Int32 maxJobsToReturn)
ClientConnectionId:00000000-0000-0000-0000-000000000000
Error Number:53,State:0,Class:20
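Since error 53 ("The network path was not found") points at name resolution or basic connectivity rather than a login failure, a check along the following lines could be run from the Async Processing server to help rule those out. This is only a rough sketch: the function name `check_sql_reachability` and the placeholder host name are hypothetical, and the default port 1433 is an assumption (the instance may listen on a different port or use named pipes only). A successful TCP connect also does not prove that a SQL login would succeed.

```python
import socket


def check_sql_reachability(host, port=1433, timeout=3.0):
    """Resolve `host`, then attempt a plain TCP connection to host:port.

    Returns a dict describing how far the check got. Port 1433 is only
    the conventional default for SQL Server and is an assumption here.
    """
    result = {"resolved": False, "address": None, "reachable": False, "error": None}

    # Step 1: name resolution (a failure here would match "network path not found").
    try:
        result["address"] = socket.gethostbyname(host)
        result["resolved"] = True
    except OSError as exc:
        result["error"] = f"name resolution failed: {exc}"
        return result

    # Step 2: TCP connect to the resolved address.
    try:
        with socket.create_connection((result["address"], port), timeout=timeout):
            result["reachable"] = True
    except OSError as exc:
        result["error"] = f"TCP connect failed: {exc}"

    return result


# Example call; "sql-server-name" is a placeholder for the server named
# in the configdb connection string:
#   print(check_sql_reachability("sql-server-name"))
```

Because the errors only appear 3-4 times per hour, a single successful run would not be conclusive; the check would need to be repeated over time to catch an intermittent failure.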
This system has been in operation for years and nothing has changed with regard to the CRM databases (i.e. the org database and the MSCRM_CONFIG database): they are the same databases on the same servers. The normal MSCRMAsyncService process is not reporting any such errors.
There are two main things I'm trying to understand:
- How can the maintenance service be reporting errors when the normal async service is not? I thought they would connect to the same databases using the same connection method and the same connection string (i.e. registry key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSCRM\configdb)?
- Presumably the tasks normally carried out by the maintenance service are not being performed because of this issue? Does the maintenance service keep a log anywhere showing which operations were attempted or completed, and what the outcome was? And is there any way to kick off these tasks manually to try to improve performance?
Any advice or pointers would be very much appreciated.
Thanks